gLM2-1protonly-8epoch-finetune

This model is a fine-tuned version of tattabio/gLM2_650M on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.9922
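
Since the card does not yet document usage, here is a minimal loading sketch. It assumes this checkpoint keeps the base gLM2_650M architecture and custom modeling code (hence `trust_remote_code=True`) and is hosted under the repo id shown; the `<+>` strand-prefix input convention is taken from the base model's documentation, not from this card.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Assumed repo id for this checkpoint; adjust if it is hosted elsewhere.
repo = "ishanjmukherjee/gLM2-1protonly-8epoch-finetune"

# gLM2 ships custom modeling code, so trust_remote_code=True is needed
# (carried over from the base tattabio/gLM2_650M card).
tokenizer = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
model = AutoModel.from_pretrained(repo, trust_remote_code=True)
model.eval()

# The base model documents strand-prefixed input, e.g. "<+>" before a
# forward-strand protein sequence; this convention is assumed to carry over.
seq = "<+>MALTKVEKRNRIKRRVRGKISGTQASPRLSVYKSNK"
inputs = tokenizer([seq], return_tensors="pt")

with torch.no_grad():
    outputs = model(inputs.input_ids, output_hidden_states=True)

embeddings = outputs.last_hidden_state  # per-token representations
```

If the fine-tuning objective was masked language modeling (the reported metric is a loss), the checkpoint may load more naturally through the masked-LM head exposed by the remote code.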

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 8
  • total_eval_batch_size: 8
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 3
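
For reference, the list above maps onto transformers `TrainingArguments` roughly as sketched below. This is an assumed reconstruction, not the exact training script; `output_dir` and any unlisted settings are placeholders.

```python
from transformers import TrainingArguments

# Assumed reconstruction of the reported hyperparameters.
# The multi-GPU setup (4 devices, total batch size 8) comes from the launcher
# (e.g. torchrun / accelerate), not from these arguments.
training_args = TrainingArguments(
    output_dir="gLM2-1protonly-8epoch-finetune",  # placeholder
    learning_rate=1e-4,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=3,
)
```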

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:---:|:---:|:---:|:---:|
| 1.231 | 0.0529 | 500 | 1.2229 |
| 1.1912 | 0.1059 | 1000 | 1.1914 |
| 1.1681 | 0.1588 | 1500 | 1.1674 |
| 1.1636 | 0.2118 | 2000 | 1.1496 |
| 1.1281 | 0.2647 | 2500 | 1.1372 |
| 1.1213 | 0.3177 | 3000 | 1.1247 |
| 1.1152 | 0.3706 | 3500 | 1.1150 |
| 1.1094 | 0.4235 | 4000 | 1.1050 |
| 1.0952 | 0.4765 | 4500 | 1.0973 |
| 1.0907 | 0.5294 | 5000 | 1.0903 |
| 1.0912 | 0.5824 | 5500 | 1.0847 |
| 1.0886 | 0.6353 | 6000 | 1.0795 |
| 1.0764 | 0.6883 | 6500 | 1.0742 |
| 1.0715 | 0.7412 | 7000 | 1.0686 |
| 1.0771 | 0.7942 | 7500 | 1.0637 |
| 1.0806 | 0.8471 | 8000 | 1.0604 |
| 1.0493 | 0.9000 | 8500 | 1.0563 |
| 1.0569 | 0.9530 | 9000 | 1.0522 |
| 1.0416 | 1.0059 | 9500 | 1.0504 |
| 1.0382 | 1.0589 | 10000 | 1.0479 |
| 1.0444 | 1.1118 | 10500 | 1.0446 |
| 1.0642 | 1.1648 | 11000 | 1.0427 |
| 1.025 | 1.2177 | 11500 | 1.0384 |
| 1.0265 | 1.2706 | 12000 | 1.0366 |
| 1.0307 | 1.3236 | 12500 | 1.0338 |
| 1.0289 | 1.3765 | 13000 | 1.0309 |
| 1.0071 | 1.4295 | 13500 | 1.0291 |
| 1.032 | 1.4824 | 14000 | 1.0276 |
| 1.0286 | 1.5354 | 14500 | 1.0241 |
| 1.0266 | 1.5883 | 15000 | 1.0222 |
| 1.0072 | 1.6413 | 15500 | 1.0206 |
| 1.0198 | 1.6942 | 16000 | 1.0194 |
| 1.0171 | 1.7471 | 16500 | 1.0172 |
| 1.007 | 1.8001 | 17000 | 1.0160 |
| 1.0175 | 1.8530 | 17500 | 1.0143 |
| 1.0265 | 1.9060 | 18000 | 1.0125 |
| 0.9966 | 1.9589 | 18500 | 1.0108 |
| 0.9973 | 2.0119 | 19000 | 1.0097 |
| 1.0099 | 2.0648 | 19500 | 1.0086 |
| 0.9914 | 2.1177 | 20000 | 1.0074 |
| 1.0189 | 2.1707 | 20500 | 1.0051 |
| 1.0053 | 2.2236 | 21000 | 1.0040 |
| 0.9951 | 2.2766 | 21500 | 1.0032 |
| 0.99 | 2.3295 | 22000 | 1.0014 |
| 0.9849 | 2.3825 | 22500 | 1.0007 |
| 0.9964 | 2.4354 | 23000 | 1.0000 |
| 0.9951 | 2.4884 | 23500 | 0.9986 |
| 0.9822 | 2.5413 | 24000 | 0.9972 |
| 0.988 | 2.5942 | 24500 | 0.9967 |
| 0.993 | 2.6472 | 25000 | 0.9955 |
| 0.9974 | 2.7001 | 25500 | 0.9942 |
| 0.9983 | 2.7531 | 26000 | 0.9937 |
| 0.9824 | 2.8060 | 26500 | 0.9938 |
| 0.9758 | 2.8590 | 27000 | 0.9927 |
| 0.9742 | 2.9119 | 27500 | 0.9928 |
| 0.9682 | 2.9648 | 28000 | 0.9922 |

Framework versions

  • Transformers 4.51.3
  • PyTorch 2.7.0+cu126
  • Datasets 3.5.1
  • Tokenizers 0.21.1