This is a ChemBERTa model trained on the augmented_canonical_pubchem_13m dataset.
The model was trained for 24 epochs using NVIDIA Apex's FusedAdam optimizer with a reduce-on-plateau learning-rate scheduler. To improve throughput, mixed precision (fp16), TF32, and torch.compile were enabled. Training used gradient accumulation (16 steps) with a batch size of 128, for an effective batch size of 2048. Evaluation was performed at regular intervals, and the best checkpoint was selected based on validation performance.
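The training setup above can be approximated with the Hugging Face `Trainer` API. This is a minimal configuration sketch, assuming the run used `transformers.TrainingArguments` (the card does not state the exact training code); the output directory and evaluation interval are illustrative placeholders.

```python
from transformers import TrainingArguments

# Sketch of the reported settings: 24 epochs, Apex FusedAdam,
# reduce-on-plateau LR schedule, fp16 + TF32 + torch.compile,
# batch size 128 with 16 gradient-accumulation steps (effective 2048).
training_args = TrainingArguments(
    output_dir="chemberta-pubchem13m",      # placeholder path
    num_train_epochs=24,
    per_device_train_batch_size=128,
    gradient_accumulation_steps=16,
    optim="adamw_apex_fused",               # NVIDIA Apex FusedAdam
    lr_scheduler_type="reduce_lr_on_plateau",
    fp16=True,                              # mixed precision
    tf32=True,                              # TF32 matmuls on Ampere+
    torch_compile=True,
    eval_strategy="steps",                  # evaluate at regular intervals
    eval_steps=1000,                        # illustrative value
    load_best_model_at_end=True,            # keep best checkpoint
    metric_for_best_model="eval_loss",
)
```

Passing these arguments to a `Trainer` together with the model, tokenizer, and datasets would reproduce the described regime; the effective batch size is `128 × 16 = 2048` per device.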