jaykmr/ESMCrystal_t6_8M_v1

Text Classification

Label Semantics:

Label 0: Non-crystallizable (Negative)

Label 1: Crystallizable (Positive)

Dataset

Model

ESMCrystal_t6_8M_v1

ESMCrystal_t6_8M_v1 is a state-of-the-art protein crystallization prediction model, fine-tuned from esm2_t6_8M_UR50D using transfer learning. It has 6 layers and about 8M parameters (approx. 31.4 MB on disk) and predicts whether an input protein sequence will crystallize.
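A minimal inference sketch, assuming the checkpoint is available under the repository ID `jaykmr/ESMCrystal_t6_8M_v1`, ships its tokenizer files, and loads as a two-logit sequence classifier through the `transformers` Auto classes. None of this is stated explicitly on the card, and the helper names `label_from_logits` and `predict` are illustrative only.

```python
import math

# Label mapping taken from the "Label Semantics" section of this card.
ID2LABEL = {0: "non-crystallizable", 1: "crystallizable"}


def label_from_logits(logits):
    """Map two raw logits to (label, probability) with a numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]


def predict(sequence, model_id="jaykmr/ESMCrystal_t6_8M_v1"):
    """Classify one amino-acid sequence; downloads the checkpoint on first use."""
    # Imported lazily so the helper above works without torch/transformers installed.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()  # disable dropout for deterministic inference
    inputs = tokenizer(sequence, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return label_from_logits(logits)
```

Calling `predict` with a one-letter amino-acid string returns a `(label, probability)` pair. The probability is the softmax score of the predicted class, which is not necessarily a calibrated likelihood of crystallization.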

Accuracy:

Dataset	Accuracy
DeepCrystal Test	0.7913593256059009
BCrystal test	0.7811975377728035
SP test	0.6962025316455697
TR test	0.8191699604743083

Comparison Table:

Dataset	Count	Positives	Negatives	TP	FN	FP	TN	Recall	Precision	F1	Accuracy	ROC AUC	Matthews coefficient	Sensitivity	Specificity
DeepCrystal Test	1898	898	1000	532	362	34	966	0.5950783	0.93992933	0.72876712	0.79091869	0.9467	0.611906376	0.5950783	0.966
BCrystal Test	1787	891	896	531	360	31	865	0.5959596	0.94483986	0.73090158	0.78119754	0.9396	0.604504011	0.5959596	0.96540179
SP Test	237	148	89	80	68	4	85	0.54054054	0.95238095	0.68965517	0.69620253	0.9328	0.501728679	0.54054054	0.95505618
TR Test	1012	374	638	207	167	16	622	0.55347594	0.92825112	0.69346734	0.81916996	0.9562	0.615341231	0.55347594	0.97492163
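The metrics in this card can be recomputed from the four confusion-matrix counts. The sketch below does this for the BCrystal test set. It assumes that 360 is the number of crystallizable sequences predicted as non-crystallizable (false negatives) and 31 is the number of non-crystallizable sequences predicted as crystallizable (false positives). That reading is the one consistent with the per-class scores reported for this test set (precision 0.94, recall 0.60 for the crystallizable class). The function name `binary_metrics` is illustrative.

```python
import math


def binary_metrics(tp, fn, fp, tn):
    """Standard binary-classification metrics from the four confusion-matrix counts."""
    precision = tp / (tp + fp)      # positive predictive value (PPV)
    recall = tp / (tp + fn)         # sensitivity, true positive rate
    specificity = tn / (tn + fp)    # true negative rate
    npv = tn / (tn + fn)            # negative predictive value
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "npv": npv,
        "f1": f1,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# BCrystal test counts under the FN/FP reading described above (an assumption).
bcrystal = binary_metrics(tp=531, fn=360, fp=31, tn=865)
for name, value in bcrystal.items():
    print(f"{name}: {value:.4f}")
```

Under this reading the sketch reproduces the reported BCrystal accuracy (0.7812) and Matthews coefficient (0.6045). It does not guard against empty classes, so a zero denominator raises `ZeroDivisionError`.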

Graphs

[Figures: ROC-AUC curves for the DeepCrystal, BCrystal, SP, and TR test sets]

[Figures: PR-AUC curves for the DeepCrystal, BCrystal, SP, and TR test sets]

Final scores:

on DeepCrystal test:

	precision	recall	f1-score	support
non-crystallizable	0.73	0.97	0.83	1000
crystallizable	0.94	0.60	0.73	898
accuracy			0.79	1898
macro avg	0.83	0.78	0.78	1898
weighted avg	0.83	0.79	0.78	1898

on BCrystal test:

	precision	recall	f1-score	support
non-crystallizable	0.71	0.97	0.82	896
crystallizable	0.94	0.60	0.73	891
accuracy			0.78	1787
macro avg	0.83	0.78	0.77	1787
weighted avg	0.83	0.78	0.77	1787

on SP test:

	precision	recall	f1-score	support
non-crystallizable	0.56	0.96	0.70	89
crystallizable	0.95	0.54	0.69	148
accuracy			0.70	237
macro avg	0.75	0.75	0.70	237
weighted avg	0.80	0.70	0.69	237

on TR test:

	precision	recall	f1-score	support
non-crystallizable	0.79	0.97	0.87	638
crystallizable	0.93	0.55	0.69	374
accuracy			0.82	1012
macro avg	0.86	0.76	0.78	1012
weighted avg	0.84	0.82	0.81	1012
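The `macro avg` and `weighted avg` rows differ only in how the two per-class values are combined: the macro average is a plain mean, while the weighted average weights each class by its support. A small sketch for the BCrystal test set, assuming the per-class precision and recall are derived from the BCrystal confusion-matrix counts with 360 false negatives and 31 false positives (the helper name `averaged` is illustrative):

```python
def averaged(per_class, supports):
    """Return (macro, weighted) averages of a per-class metric."""
    macro = sum(per_class) / len(per_class)
    weighted = sum(v * n for v, n in zip(per_class, supports)) / sum(supports)
    return macro, weighted


# BCrystal test, class order: non-crystallizable, crystallizable.
supports = [896, 891]
precision = [865 / (865 + 360), 531 / (531 + 31)]
recall = [865 / 896, 531 / 891]

macro_p, weighted_p = averaged(precision, supports)
macro_r, weighted_r = averaged(recall, supports)
print(f"precision: macro={macro_p:.2f} weighted={weighted_p:.2f}")
print(f"recall:    macro={macro_r:.2f} weighted={weighted_r:.2f}")
```

Rounded to two decimals, these match the BCrystal rows above (0.83 for precision, 0.78 for recall). Because the two classes are nearly balanced in that test set, the macro and weighted values almost coincide. They diverge more on imbalanced sets such as SP.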

Confusion matrix (rows: actual class, columns: predicted class; order: crystallizable, non-crystallizable):

on DeepCrystal test:

    | 532 | 362 |
    |  34 | 966 |

on BCrystal test:

    | 531 | 360 |
    |  31 | 865 |

on SP test:

    | 80 | 68 |
    |  4 | 85 |

on TR test:

    | 207 | 167 |
    |  16 | 622 |

Metrics

ROC AUC score:

on DeepCrystal test: 0.9467594654788418
on BCrystal test: 0.946546316337983
on SP test: 0.9328120255086547
on TR test: 0.9562804888270497
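ROC AUC is independent of the decision threshold: it equals the probability that a randomly chosen crystallizable sequence receives a higher score than a randomly chosen non-crystallizable one. The sketch below computes it directly from that definition. The scores are made up for illustration and are not model outputs, and the function name `roc_auc` is hypothetical.

```python
def roc_auc(pos_scores, neg_scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Made-up scores for illustration only.
pos = [0.9, 0.8, 0.4]   # scores assigned to crystallizable sequences
neg = [0.7, 0.3, 0.2]   # scores assigned to non-crystallizable sequences
auc = roc_auc(pos, neg)
print(f"AUC = {auc:.4f}")
```

This pairwise loop is quadratic in the number of samples and is meant only to show the definition. Because the measure ignores the threshold, a high AUC can coexist with a low recall at the default cut-off.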

Matthews correlation coefficient (MCC):

on DeepCrystal test: 0.6130826598876417
on BCrystal test: 0.6045040114572474
on SP test: 0.5017286791304684
on TR test: 0.6153412305503776

Specificity, TN / (TN + FP):

on DeepCrystal test: 0.966
on BCrystal test: 0.9654017857142857
on SP test: 0.9550561797752809
on TR test: 0.9749216300940439

Sensitivity (recall), TP / (TP + FN):

on DeepCrystal test: 0.5968819599109132
on BCrystal test: 0.5959595959595959
on SP test: 0.5405405405405406
on TR test: 0.553475935828877

Researchers:

Credits:

Safetensors: model size 7.84M params; tensor types I64, F32