hazemessam/esm3_ddg_v2 · Model results

Hi,
It's expected to perform poorly because the dataset it was trained on contains few examples. I collected larger datasets with more examples, but unfortunately I did not have enough time to train the model on them.

Here are the datasets:
https://huggingface.co/datasets/hazemessam/abyssal_db
https://huggingface.co/datasets/hazemessam/ddg_megadataset (largest dataset)
https://huggingface.co/datasets/hazemessam/prostata
https://huggingface.co/datasets/hazemessam/ddg (this is the one I used to train the model; it was trained only on ssym.csv)
https://huggingface.co/datasets/hazemessam/fireprot_db

If you would like to check the training script: https://github.com/hazemessamm/silica/blob/main/silica/stability_training.py

Note: If you are going to train the model on ddg_megadataset (or any dataset that already contains the swapped sequences), make sure to turn off the swap parameter in the SingleMutationDatasetV2 class, because the examples in that dataset are already swapped.
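
To illustrate why this matters, here is a toy sketch. It is not the code from the training script: `Example` and `swap_augment` are hypothetical names, the sequences are made up, and it assumes that swapping means adding the reverse mutation (mutant and wild type exchanged) with the sign of the ddG flipped. Under that assumption, swapping a dataset that is already swapped only adds duplicates:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Example:
    # Hypothetical record layout; the real datasets may use other column names.
    wild_type: str
    mutant: str
    ddg: float


def swap_augment(examples):
    """Return the examples plus their reverse mutations with negated ddG.

    Assumption: this mirrors what a swap option typically does for ddG data;
    it is not taken from SingleMutationDatasetV2.
    """
    reversed_examples = [Example(e.mutant, e.wild_type, -e.ddg) for e in examples]
    return list(examples) + reversed_examples


raw = [Example("MKTAY", "MKTAV", 1.2)]  # toy sequences, not real data
swapped_once = swap_augment(raw)             # what a pre-swapped dataset looks like
swapped_twice = swap_augment(swapped_once)   # swap left on for a pre-swapped dataset

print(len(swapped_once), len(swapped_twice), len(set(swapped_twice)))
```

The second pass doubles the number of rows without adding any new information, so every example is simply seen twice per epoch.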

Sorry for the late reply. I hope I was able to explain everything clearly; if not, let me know.