glm2-multiepoch-restart-finetune

This model is a fine-tuned version of tattabio/gLM2_650M on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.0192
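
For readers who prefer perplexity, the loss above can be converted with a single exponential. This is a minimal sketch that assumes the reported loss is a mean cross-entropy in nats (natural log), which the card does not state explicitly; the function name is illustrative.

```python
import math

# Evaluation loss reported above.
EVAL_LOSS = 1.0192


def perplexity(mean_cross_entropy_nats: float) -> float:
    """Convert a mean cross-entropy (assumed to be in nats) to perplexity."""
    return math.exp(mean_cross_entropy_nats)


if __name__ == "__main__":
    print(f"perplexity ~ {perplexity(EVAL_LOSS):.3f}")
```

If the loss were instead measured in bits, the base would be 2 rather than e, so treat the result as indicative only.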

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 3
  • total_train_batch_size: 6
  • total_eval_batch_size: 6
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 20
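
The list above maps fairly directly onto keyword arguments of `transformers.TrainingArguments`. The sketch below collects them in a plain dictionary, shows how the total batch size of 6 follows from the per-device size and device count, and gives the shape of a linear decay schedule. The `output_dir` value, a gradient accumulation of 1, and zero warmup steps are assumptions; the card lists none of them.

```python
# Values copied from the hyperparameter list above.
LEARNING_RATE = 1e-4
PER_DEVICE_TRAIN_BATCH_SIZE = 2
PER_DEVICE_EVAL_BATCH_SIZE = 2
NUM_DEVICES = 3
NUM_EPOCHS = 20
SEED = 42


def training_kwargs(output_dir="glm2-finetune-out"):
    """Keyword arguments intended for transformers.TrainingArguments.

    output_dir is a placeholder; the card does not state it.
    """
    return {
        "output_dir": output_dir,
        "learning_rate": LEARNING_RATE,
        "per_device_train_batch_size": PER_DEVICE_TRAIN_BATCH_SIZE,
        "per_device_eval_batch_size": PER_DEVICE_EVAL_BATCH_SIZE,
        "seed": SEED,
        "optim": "adamw_torch",
        "adam_beta1": 0.9,
        "adam_beta2": 0.999,
        "adam_epsilon": 1e-8,
        "lr_scheduler_type": "linear",
        "num_train_epochs": NUM_EPOCHS,
    }


def total_batch_size(per_device, num_devices, grad_accum_steps=1):
    """Samples per optimizer step (grad_accum_steps=1 is an assumption)."""
    return per_device * num_devices * grad_accum_steps


def linear_lr(step, total_steps, base_lr=LEARNING_RATE, warmup_steps=0):
    """Linear warmup followed by linear decay to zero.

    warmup_steps=0 is an assumption; total_steps is not given in the card.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    print(total_batch_size(PER_DEVICE_TRAIN_BATCH_SIZE, NUM_DEVICES))
    # With transformers installed, the dictionary can be unpacked directly:
    #   from transformers import TrainingArguments
    #   args = TrainingArguments(**training_kwargs())
```

The multi-GPU setup itself is not configured through these arguments; it comes from how the script is launched (for example with `torchrun` or `accelerate launch`).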

Training results

Training Loss Epoch Step Validation Loss
1.3921 0.0300 500 1.3718
1.3769 0.0600 1000 1.3400
1.308 0.0900 1500 1.3196
1.3335 0.1200 2000 1.3049
1.2828 0.1500 2500 1.2896
1.274 0.1800 3000 1.2780
1.2775 0.2100 3500 1.2693
1.2406 0.2400 4000 1.2595
1.2564 0.2699 4500 1.2512
1.2059 0.2999 5000 1.2448
1.2546 0.3299 5500 1.2386
1.2012 0.3599 6000 1.2331
1.2101 0.3899 6500 1.2259
1.2179 0.4199 7000 1.2202
1.2289 0.4499 7500 1.2163
1.2467 0.4799 8000 1.2120
1.2331 0.5099 8500 1.2084
1.1877 0.5399 9000 1.2039
1.1995 0.5699 9500 1.2006
1.2188 0.5999 10000 1.1963
1.202 0.6299 10500 1.1932
1.2057 0.6599 11000 1.1896
1.2057 0.6899 11500 1.1862
1.1381 0.7199 12000 1.1854
1.2025 0.7499 12500 1.1814
1.1759 0.7798 13000 1.1786
1.1652 0.8098 13500 1.1770
1.172 0.8398 14000 1.1753
1.1481 0.8698 14500 1.1725
1.1717 0.8998 15000 1.1706
1.1855 0.9298 15500 1.1681
1.175 0.9598 16000 1.1657
1.1678 0.9898 16500 1.1630
1.1619 1.0198 17000 1.1620
1.1635 1.0498 17500 1.1611
1.156 1.0798 18000 1.1581
1.126 1.1098 18500 1.1572
1.1422 1.1398 19000 1.1556
1.1328 1.1698 19500 1.1543
1.1921 1.1998 20000 1.1516
1.11 1.2298 20500 1.1504
1.1327 1.2597 21000 1.1496
1.1495 1.2897 21500 1.1475
1.0984 1.3197 22000 1.1470
1.0884 1.3497 22500 1.1466
1.1816 1.3797 23000 1.1442
1.1313 1.4097 23500 1.1421
1.179 1.4397 24000 1.1416
1.1379 1.4697 24500 1.1402
1.0764 1.4997 25000 1.1396
1.1421 1.5297 25500 1.1375
1.1311 1.5597 26000 1.1369
1.1705 1.5897 26500 1.1355
1.1325 1.6197 27000 1.1343
1.1477 1.6497 27500 1.1330
1.1442 1.6797 28000 1.1323
1.1019 1.7097 28500 1.1306
1.124 1.7397 29000 1.1301
1.1338 1.7696 29500 1.1298
1.0975 1.7996 30000 1.1288
1.0954 1.8296 30500 1.1278
1.1178 1.8596 31000 1.1256
1.138 1.8896 31500 1.1258
1.1168 1.9196 32000 1.1252
1.112 1.9496 32500 1.1231
1.1189 1.9796 33000 1.1216
1.1296 2.0096 33500 1.1212
1.0995 2.0396 34000 1.1213
1.0987 2.0696 34500 1.1195
1.0958 2.0996 35000 1.1197
1.113 2.1296 35500 1.1183
1.1211 2.1596 36000 1.1182
1.0896 2.1896 36500 1.1162
1.1201 2.2196 37000 1.1165
1.1001 2.2496 37500 1.1144
1.081 2.2795 38000 1.1148
1.1327 2.3095 38500 1.1129
1.1288 2.3395 39000 1.1124
1.1121 2.3695 39500 1.1123
1.0992 2.3995 40000 1.1116
1.1054 2.4295 40500 1.1105
1.0676 2.4595 41000 1.1104
1.1037 2.4895 41500 1.1095
1.1117 2.5195 42000 1.1093
1.0669 2.5495 42500 1.1079
1.07 2.5795 43000 1.1081
1.0842 2.6095 43500 1.1066
1.0481 2.6395 44000 1.1064
1.1015 2.6695 44500 1.1062
1.0959 2.6995 45000 1.1046
1.0405 2.7295 45500 1.1040
1.0753 2.7594 46000 1.1027
1.0511 4.1839 46500 1.0966
1.0934 4.2289 47000 1.0940
1.0557 4.2739 47500 1.0931
1.0742 4.3189 48000 1.0939
1.1059 4.3639 48500 1.0921
1.0638 4.4089 49000 1.0912
1.0782 4.4538 49500 1.0908
1.0929 4.4988 50000 1.0895
1.0702 4.5438 50500 1.0886
1.0666 4.5888 51000 1.0871
1.0674 4.6338 51500 1.0868
1.0381 4.6788 52000 1.0851
1.0433 4.7238 52500 1.0849
1.0589 4.7688 53000 1.0828
1.0544 4.8137 53500 1.0835
1.0763 4.8587 54000 1.0835
1.0957 4.9037 54500 1.0815
1.0435 4.9487 55000 1.0808
1.0872 4.9937 55500 1.0802
1.0316 5.0387 56000 1.0800
1.0768 5.0837 56500 1.0806
1.0527 5.1287 57000 1.0789
1.0658 5.1737 57500 1.0777
1.0848 5.2186 58000 1.0787
1.0417 5.2636 58500 1.0780
1.0288 5.3086 59000 1.0772
1.0158 5.3536 59500 1.0774
1.0354 5.3986 60000 1.0754
1.0298 5.4436 60500 1.0740
1.0156 5.4886 61000 1.0738
1.0389 5.5336 61500 1.0749
1.0896 5.5785 62000 1.0736
1.0455 5.6235 62500 1.0725
1.0135 5.6685 63000 1.0723
1.0205 5.7135 63500 1.0721
1.0631 5.7585 64000 1.0713
1.0586 5.8035 64500 1.0699
1.0293 5.8485 65000 1.0702
1.0873 5.8935 65500 1.0687
1.0054 5.9385 66000 1.0681
1.0358 5.9834 66500 1.0683
1.0028 6.0284 67000 1.0686
1.0361 6.0734 67500 1.0682
1.0144 6.1184 68000 1.0675
1.0128 6.1634 68500 1.0665
1.0164 6.2084 69000 1.0661
1.0394 6.2534 69500 1.0660
1.0405 6.2984 70000 1.0651
1.0181 6.3434 70500 1.0650
0.9909 6.3883 71000 1.0646
1.0374 6.4333 71500 1.0644
1.0494 6.4783 72000 1.0629
1.0692 6.5233 72500 1.0627
1.0536 6.5683 73000 1.0632
1.0052 6.6133 73500 1.0614
1.0128 6.6583 74000 1.0613
1.0168 6.7033 74500 1.0618
1.0176 6.7482 75000 1.0605
1.0713 6.7932 75500 1.0609
0.9894 6.8382 76000 1.0598
1.0714 6.8832 76500 1.0591
0.9916 6.9282 77000 1.0587
1.0427 6.9732 77500 1.0584
1.014 7.0182 78000 1.0586
0.9905 7.0632 78500 1.0602
0.9802 7.1082 79000 1.0579
1.0112 7.1531 79500 1.0580
1.0417 7.1981 80000 1.0571
1.0238 7.2431 80500 1.0564
1.0127 7.2881 81000 1.0564
1.0181 7.3331 81500 1.0574
1.0049 7.3781 82000 1.0564
1.0327 7.4231 82500 1.0550
1.0129 7.4681 83000 1.0544
1.0392 7.5130 83500 1.0536
1.0037 7.5580 84000 1.0541
1.0018 7.6030 84500 1.0526
1.0108 7.6480 85000 1.0528
0.9891 7.6930 85500 1.0517
1.0211 7.7380 86000 1.0525
0.9783 7.7830 86500 1.0522
1.0081 7.8280 87000 1.0514
1.0585 7.8730 87500 1.0510
1.0002 7.9179 88000 1.0503
1.0012 7.9629 88500 1.0509
0.9855 8.0079 89000 1.0507
0.9962 8.0529 89500 1.0505
0.9852 8.0979 90000 1.0514
1.0041 8.1429 90500 1.0502
0.9919 8.1879 91000 1.0503
1.0142 8.2329 91500 1.0491
0.9744 8.2778 92000 1.0504
0.9894 8.3228 92500 1.0489
1.0068 8.3678 93000 1.0485
1.0 8.4128 93500 1.0484
0.9493 8.4578 94000 1.0485
0.9996 8.5028 94500 1.0479
0.9446 8.5478 95000 1.0480
0.9996 8.5928 95500 1.0468
0.9933 8.6378 96000 1.0473
1.0096 8.6827 96500 1.0460
1.0079 8.7277 97000 1.0450
1.0229 8.7727 97500 1.0459
0.9848 8.8177 98000 1.0445
0.9771 8.8627 98500 1.0454
0.9882 8.9077 99000 1.0439
0.9984 8.9527 99500 1.0442
0.9576 8.9977 100000 1.0439
1.0063 9.0426 100500 1.0452
0.9579 9.0876 101000 1.0444
0.9952 9.1326 101500 1.0446
0.9817 9.1776 102000 1.0435
0.9547 9.2226 102500 1.0441
0.9854 9.2676 103000 1.0446
0.9604 9.3126 103500 1.0435
1.0053 9.3576 104000 1.0415
0.9465 9.4026 104500 1.0432
0.9908 9.4475 105000 1.0416
1.018 9.4925 105500 1.0410
0.9796 9.5375 106000 1.0402
0.9596 9.5825 106500 1.0412
0.9694 9.6275 107000 1.0405
0.9907 9.6725 107500 1.0413
0.9729 9.7175 108000 1.0400
0.9518 9.7625 108500 1.0387
0.9678 9.8075 109000 1.0392
0.9764 9.8524 109500 1.0394
0.9729 9.8974 110000 1.0396
0.9797 9.9424 110500 1.0381
0.9911 9.9874 111000 1.0368
0.978 10.0324 111500 1.0390
0.9522 10.0774 112000 1.0403
0.9464 10.1224 112500 1.0393
0.9656 10.1674 113000 1.0388
0.9739 10.2123 113500 1.0395
0.9218 10.2573 114000 1.0377
0.9703 10.3023 114500 1.0374
0.9872 10.3473 115000 1.0381
0.9532 10.3923 115500 1.0380
0.9473 10.4373 116000 1.0360
0.9764 10.4823 116500 1.0365
0.9771 10.5273 117000 1.0363
0.9417 10.5723 117500 1.0375
0.9752 10.6172 118000 1.0361
0.9501 10.6622 118500 1.0358
0.9781 10.7072 119000 1.0354
0.9632 10.7522 119500 1.0352
1.0124 10.7972 120000 1.0339
0.9647 10.8422 120500 1.0335
0.9709 10.8872 121000 1.0337
1.0011 10.9322 121500 1.0334
0.9705 10.9771 122000 1.0333
0.9394 11.0221 122500 1.0354
0.9526 11.0671 123000 1.0354
0.9439 11.1121 123500 1.0347
0.9445 11.1571 124000 1.0348
0.9604 11.2021 124500 1.0347
0.9391 11.2471 125000 1.0345
1.0077 11.2921 125500 1.0350
0.9618 11.3371 126000 1.0338
0.9297 11.3820 126500 1.0324
0.9505 11.4270 127000 1.0319
0.974 11.4720 127500 1.0309
0.9476 11.5170 128000 1.0327
0.9143 11.5620 128500 1.0324
0.9504 11.6070 129000 1.0324
0.9471 11.6520 129500 1.0316
0.9588 11.6970 130000 1.0308
0.9331 11.7419 130500 1.0318
0.9561 11.7869 131000 1.0297
0.9535 11.8319 131500 1.0304
0.9434 11.8769 132000 1.0295
0.963 11.9219 132500 1.0287
0.9592 11.9669 133000 1.0292
0.9195 12.0119 133500 1.0306
0.9648 12.0569 134000 1.0310
0.9431 12.1019 134500 1.0306
0.9517 12.1468 135000 1.0311
0.9094 12.1918 135500 1.0297
0.9473 12.2368 136000 1.0308
0.9659 12.2818 136500 1.0302
0.9673 12.3268 137000 1.0295
0.9526 12.3718 137500 1.0294
0.9228 12.4168 138000 1.0296
0.9517 12.4618 138500 1.0295
0.9289 12.5067 139000 1.0284
0.9429 12.5517 139500 1.0295
0.9529 12.5967 140000 1.0270
0.9521 12.6417 140500 1.0287
0.9831 12.6867 141000 1.0272
0.9634 12.7317 141500 1.0267
0.9243 12.7767 142000 1.0267
0.9291 12.8217 142500 1.0278
0.9644 12.8667 143000 1.0261
0.8941 12.9116 143500 1.0263
0.938 12.9566 144000 1.0258
0.9277 13.0016 144500 1.0278
0.9202 13.0466 145000 1.0276
0.9337 13.0916 145500 1.0279
0.9224 13.1366 146000 1.0277
0.9795 13.1816 146500 1.0269
0.8956 13.2266 147000 1.0272
0.9434 13.2715 147500 1.0267
0.9695 13.3165 148000 1.0256
0.9161 13.3615 148500 1.0265
0.9375 13.4065 149000 1.0260
0.9364 13.4515 149500 1.0266
0.9189 13.4965 150000 1.0261
0.9729 13.5415 150500 1.0247
0.9188 13.5865 151000 1.0258
0.9576 13.6315 151500 1.0259
0.9244 13.6764 152000 1.0259
0.9509 13.7214 152500 1.0244
0.9387 13.7664 153000 1.0251
0.9243 13.8114 153500 1.0241
0.9056 13.8564 154000 1.0245
0.9043 13.9014 154500 1.0232
0.9014 13.9464 155000 1.0239
0.9564 13.9914 155500 1.0226
0.9113 14.0364 156000 1.0247
0.9295 14.0813 156500 1.0257
0.9084 14.1263 157000 1.0252
0.9076 14.1713 157500 1.0244
0.8789 14.2163 158000 1.0250
0.9204 14.2613 158500 1.0243
0.9123 14.3063 159000 1.0247
0.9157 14.3513 159500 1.0244
0.9014 14.3963 160000 1.0250
0.9735 14.4412 160500 1.0242
0.9389 14.4862 161000 1.0248
0.9458 14.5312 161500 1.0230
0.9334 14.5762 162000 1.0246
0.9303 14.6212 162500 1.0220
0.9147 14.6662 163000 1.0235
0.9156 14.7112 163500 1.0224
0.908 14.7562 164000 1.0216
0.8968 14.8012 164500 1.0217
0.8959 14.8461 165000 1.0231
0.917 14.8911 165500 1.0214
0.8793 14.9361 166000 1.0223
0.9076 14.9811 166500 1.0209
0.9004 15.0261 167000 1.0239
0.8836 15.0711 167500 1.0241
0.9224 15.1161 168000 1.0230
0.887 15.1611 168500 1.0224
0.915 15.2060 169000 1.0233
0.9147 15.2510 169500 1.0232
0.9003 15.2960 170000 1.0230
0.8909 15.3410 170500 1.0219
0.9536 15.3860 171000 1.0231
0.8724 15.4310 171500 1.0205
0.9108 15.4760 172000 1.0215
0.9477 15.5210 172500 1.0220
0.9139 15.5660 173000 1.0225
0.8863 15.6109 173500 1.0210
0.9205 15.6559 174000 1.0221
0.8948 15.7009 174500 1.0210
0.9151 15.7459 175000 1.0218
0.912 15.7909 175500 1.0208
0.9242 15.8359 176000 1.0200
0.9112 15.8809 176500 1.0207
0.9077 15.9259 177000 1.0200
0.9378 15.9708 177500 1.0189
0.8704 16.0158 178000 1.0221
0.8958 16.0608 178500 1.0215
0.8812 16.1058 179000 1.0228
0.9109 16.1508 179500 1.0219
0.8937 16.1958 180000 1.0223
0.9146 16.2408 180500 1.0217
0.9129 16.2858 181000 1.0227
0.8659 16.3308 181500 1.0201
0.9195 16.3757 182000 1.0224
0.8898 16.4207 182500 1.0218
0.8959 16.4657 183000 1.0201
0.9082 16.5107 183500 1.0206
0.8965 16.5557 184000 1.0206
0.8872 16.6007 184500 1.0208
0.8739 16.6457 185000 1.0198
0.8899 16.6907 185500 1.0198
0.9077 16.7356 186000 1.0190
0.8721 16.7806 186500 1.0194
0.8869 16.8256 187000 1.0196
0.9246 16.8706 187500 1.0195
0.9082 16.9156 188000 1.0190
0.9105 16.9606 188500 1.0180
0.8791 17.0056 189000 1.0204
0.8779 17.0506 189500 1.0202
0.8951 17.0956 190000 1.0209
0.88 17.1405 190500 1.0202
0.8846 17.1855 191000 1.0205
0.9008 17.2305 191500 1.0209
0.8934 17.2755 192000 1.0225
0.8832 17.3205 192500 1.0208
0.8855 17.3655 193000 1.0210
0.8948 17.4105 193500 1.0206
0.8983 17.4555 194000 1.0196
0.8758 17.5004 194500 1.0203
0.8894 17.5454 195000 1.0210
0.8877 17.5904 195500 1.0195
0.8814 17.6354 196000 1.0193
0.87 17.6804 196500 1.0203
0.89 17.7254 197000 1.0201
0.905 17.7704 197500 1.0185
0.8633 17.8154 198000 1.0191
0.8967 17.8604 198500 1.0189
0.9045 17.9053 199000 1.0190
0.8935 17.9503 199500 1.0182
0.887 17.9953 200000 1.0190
0.8905 18.0403 200500 1.0213
0.9219 18.0853 201000 1.0215
0.8838 18.1303 201500 1.0199
0.8942 18.1753 202000 1.0217
0.8973 18.2203 202500 1.0200
0.8832 18.2653 203000 1.0208
0.8891 18.3102 203500 1.0202
0.8851 18.3552 204000 1.0208
0.895 18.4002 204500 1.0198
0.8595 18.4452 205000 1.0196
0.8856 18.4902 205500 1.0196
0.8785 18.5352 206000 1.0205
0.911 18.5802 206500 1.0196
0.8647 18.6252 207000 1.0193
0.8728 18.6701 207500 1.0185
0.865 18.7151 208000 1.0194
0.8469 18.7601 208500 1.0196
0.9129 18.8051 209000 1.0203
0.841 18.8501 209500 1.0179
0.8875 18.8951 210000 1.0191
0.8451 18.9401 210500 1.0181
0.8708 18.9851 211000 1.0193
0.9189 19.0301 211500 1.0200
0.8352 19.0750 212000 1.0202
0.8503 19.1200 212500 1.0192
0.8602 19.1650 213000 1.0203
0.8637 19.2100 213500 1.0195
0.8899 19.2550 214000 1.0196
0.892 19.3000 214500 1.0203
0.8654 19.3450 215000 1.0195
0.8998 19.3900 215500 1.0205
0.9005 19.4349 216000 1.0196
0.8659 19.4799 216500 1.0203
0.8641 19.5249 217000 1.0194
0.8776 19.5699 217500 1.0193
0.8616 19.6149 218000 1.0192
0.8758 19.6599 218500 1.0191
0.879 19.7049 219000 1.0190
0.8591 19.7499 219500 1.0199
0.8919 19.7949 220000 1.0196
0.8637 19.8398 220500 1.0201
0.9084 19.8848 221000 1.0195
0.8651 19.9298 221500 1.0189
0.8676 19.9748 222000 1.0192
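
The Epoch column in the table above is not continuous: it jumps from 2.7594 at step 46000 to 4.1839 at step 46500. One way to see this is to compute the steps per epoch implied by each row (step divided by epoch) and flag where that ratio shifts. The sketch below does so on six rows copied from the table; the helper names and the 5% tolerance are illustrative choices.

```python
# (step, epoch) pairs copied from the results table, around the jump.
ROWS = [
    (45000, 2.6995),
    (45500, 2.7295),
    (46000, 2.7594),
    (46500, 4.1839),
    (47000, 4.2289),
    (47500, 4.2739),
]


def implied_steps_per_epoch(step, epoch):
    """Steps per epoch implied by a single logged row."""
    return step / epoch


def find_jumps(rows, tolerance=0.05):
    """Return consecutive row pairs whose implied steps/epoch differ
    by more than `tolerance` (relative to the earlier row)."""
    jumps = []
    for prev, cur in zip(rows, rows[1:]):
        before = implied_steps_per_epoch(*prev)
        after = implied_steps_per_epoch(*cur)
        if abs(before - after) / before > tolerance:
            jumps.append((prev, cur))
    return jumps


if __name__ == "__main__":
    for prev, cur in find_jumps(ROWS):
        before = implied_steps_per_epoch(*prev)
        after = implied_steps_per_epoch(*cur)
        print(f"step {prev[0]} -> {cur[0]}: "
              f"{before:.0f} -> {after:.0f} steps/epoch")
```

On these rows the implied value moves from roughly 16,670 to roughly 11,114 steps per epoch, a ratio close to 1.5. The card does not explain the change; it may reflect a different effective batch size or dataset length after the restart the model name alludes to, but that is only a guess.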

Framework versions

  • Transformers 4.51.3
  • Pytorch 2.7.0+cu126
  • Datasets 3.5.1
  • Tokenizers 0.21.1

Model size: 671M params (Safetensors, tensor type F32)