train_record_1745950249

This model is a fine-tuned version of google/gemma-3-1b-it on the record dataset. It achieves the following results on the evaluation set:

  • Loss: 15.1265
  • Num Input Tokens Seen: 55002224
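
How to load

Since the framework versions below include PEFT, this checkpoint is a PEFT adapter rather than a full model and must be loaded on top of google/gemma-3-1b-it. A minimal loading sketch (the prompt and generation settings are illustrative, not from the training setup):

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "google/gemma-3-1b-it"
adapter_id = "rbelanec/train_record_1745950249"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)

# Attach the fine-tuned adapter weights on top of the frozen base model.
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()

inputs = tokenizer("Example prompt:", return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```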

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 123
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 4
  • optimizer: adamw_torch (betas=(0.9, 0.999), epsilon=1e-08; no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • training_steps: 40000
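
A sketch of an equivalent Hugging Face TrainingArguments object, assuming training_steps maps to max_steps and that the total train batch size of 4 is 2 per-device samples × 2 accumulation steps (output_dir is a placeholder; the dataset preprocessing and Trainer wiring are omitted):

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the configuration listed above.
training_args = TrainingArguments(
    output_dir="train_record_1745950249",  # placeholder
    learning_rate=5e-05,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=123,
    gradient_accumulation_steps=2,  # effective train batch size: 2 * 2 = 4
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    max_steps=40000,
)
```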

Training results

| Training Loss | Epoch  | Step  | Validation Loss | Input Tokens Seen |
|:-------------:|:------:|:-----:|:---------------:|:-----------------:|
| 12.9802       | 0.0064 | 200   | 15.4294         | 277264            |
| 15.5653       | 0.0128 | 400   | 15.3182         | 548976            |
| 14.8651       | 0.0192 | 600   | 15.2347         | 826016            |
| 15.0577       | 0.0256 | 800   | 15.2450         | 1099968           |
| 14.3819       | 0.0320 | 1000  | 15.2481         | 1374672           |
| 14.8614       | 0.0384 | 1200  | 15.2286         | 1647936           |
| 14.6905       | 0.0448 | 1400  | 15.2245         | 1921648           |
| 14.7451       | 0.0512 | 1600  | 15.2363         | 2194448           |
| 12.4446       | 0.0576 | 1800  | 15.2256         | 2472048           |
| 14.8964       | 0.0640 | 2000  | 15.2178         | 2746752           |
| 16.8131       | 0.0704 | 2200  | 15.1959         | 3020144           |
| 13.821        | 0.0768 | 2400  | 15.2336         | 3296624           |
| 14.8777       | 0.0832 | 2600  | 15.2111         | 3571808           |
| 14.2003       | 0.0896 | 2800  | 15.2043         | 3847184           |
| 14.7555       | 0.0960 | 3000  | 15.1955         | 4121024           |
| 12.9845       | 0.1024 | 3200  | 15.1956         | 4396880           |
| 15.1686       | 0.1088 | 3400  | 15.2007         | 4671152           |
| 14.952        | 0.1152 | 3600  | 15.1925         | 4950800           |
| 15.3092       | 0.1216 | 3800  | 15.1615         | 5228512           |
| 14.5589       | 0.1280 | 4000  | 15.1821         | 5504608           |
| 16.1102       | 0.1344 | 4200  | 15.1833         | 5778176           |
| 16.944        | 0.1408 | 4400  | 15.1632         | 6055712           |
| 16.6215       | 0.1472 | 4600  | 15.1717         | 6331680           |
| 13.1747       | 0.1536 | 4800  | 15.1675         | 6604544           |
| 14.1834       | 0.1600 | 5000  | 15.1918         | 6882256           |
| 14.1643       | 0.1664 | 5200  | 15.1407         | 7159072           |
| 16.262        | 0.1728 | 5400  | 15.1633         | 7433136           |
| 15.0187       | 0.1792 | 5600  | 15.1981         | 7707776           |
| 15.4275       | 0.1856 | 5800  | 15.2009         | 7985472           |
| 14.5093       | 0.1920 | 6000  | 15.1784         | 8259552           |
| 15.5713       | 0.1985 | 6200  | 15.1697         | 8535952           |
| 15.0231       | 0.2049 | 6400  | 15.1983         | 8809968           |
| 13.2381       | 0.2113 | 6600  | 15.1305         | 9084016           |
| 15.0543       | 0.2177 | 6800  | 15.1265         | 9357456           |
| 14.2839       | 0.2241 | 7000  | 15.1470         | 9630608           |
| 14.1731       | 0.2305 | 7200  | 15.1426         | 9907888           |
| 14.4535       | 0.2369 | 7400  | 15.1830         | 10182048          |
| 16.5245       | 0.2433 | 7600  | 15.1353         | 10458544          |
| 14.0598       | 0.2497 | 7800  | 15.1353         | 10736144          |
| 15.952        | 0.2561 | 8000  | 15.1630         | 11010512          |
| 13.0096       | 0.2625 | 8200  | 15.1618         | 11284128          |
| 15.3546       | 0.2689 | 8400  | 15.1654         | 11556816          |
| 16.3646       | 0.2753 | 8600  | 15.1641         | 11828816          |
| 14.7599       | 0.2817 | 8800  | 15.1585         | 12104176          |
| 13.1527       | 0.2881 | 9000  | 15.1686         | 12378784          |
| 14.0753       | 0.2945 | 9200  | 15.1715         | 12654368          |
| 15.9636       | 0.3009 | 9400  | 15.1761         | 12927088          |
| 15.208        | 0.3073 | 9600  | 15.1809         | 13199552          |
| 14.3356       | 0.3137 | 9800  | 15.1842         | 13473952          |
| 15.6574       | 0.3201 | 10000 | 15.1568         | 13750288          |
| 14.857        | 0.3265 | 10200 | 15.1562         | 14025248          |
| 15.7608       | 0.3329 | 10400 | 15.1558         | 14300160          |
| 16.2262       | 0.3393 | 10600 | 15.1596         | 14577760          |
| 15.392        | 0.3457 | 10800 | 15.1578         | 14851280          |
| 13.7857       | 0.3521 | 11000 | 15.1574         | 15125104          |
| 16.3688       | 0.3585 | 11200 | 15.1550         | 15398624          |
| 15.6465       | 0.3649 | 11400 | 15.1580         | 15672384          |
| 16.6331       | 0.3713 | 11600 | 15.1562         | 15946384          |
| 14.5599       | 0.3777 | 11800 | 15.1567         | 16220112          |
| 14.0347       | 0.3841 | 12000 | 15.1537         | 16493920          |
| 15.4296       | 0.3905 | 12200 | 15.1600         | 16771376          |
| 16.117        | 0.3969 | 12400 | 15.1438         | 17046656          |
| 15.5492       | 0.4033 | 12600 | 15.1435         | 17318272          |
| 15.4808       | 0.4097 | 12800 | 15.1435         | 17591696          |
| 15.5101       | 0.4161 | 13000 | 15.1441         | 17864256          |
| 14.4989       | 0.4225 | 13200 | 15.1461         | 18137984          |
| 13.6671       | 0.4289 | 13400 | 15.1443         | 18413504          |
| 14.6588       | 0.4353 | 13600 | 15.1439         | 18690528          |
| 17.4437       | 0.4417 | 13800 | 15.1473         | 18966352          |
| 14.0331       | 0.4481 | 14000 | 15.1442         | 19242160          |
| 13.0993       | 0.4545 | 14200 | 15.1441         | 19518832          |
| 13.6133       | 0.4609 | 14400 | 15.1444         | 19795920          |
| 15.4418       | 0.4673 | 14600 | 15.1433         | 20073168          |
| 13.4136       | 0.4737 | 14800 | 15.1449         | 20349056          |
| 14.7192       | 0.4801 | 15000 | 15.1442         | 20622896          |
| 15.6836       | 0.4865 | 15200 | 15.1551         | 20896768          |
| 13.27         | 0.4929 | 15400 | 15.1552         | 21171376          |
| 14.9471       | 0.4993 | 15600 | 15.1545         | 21447568          |
| 16.4186       | 0.5057 | 15800 | 15.1562         | 21722256          |
| 14.1038       | 0.5121 | 16000 | 15.1550         | 21998320          |
| 15.4738       | 0.5185 | 16200 | 15.1552         | 22273616          |
| 13.9429       | 0.5249 | 16400 | 15.1555         | 22549280          |
| 13.12         | 0.5313 | 16600 | 15.1553         | 22823984          |
| 14.7638       | 0.5377 | 16800 | 15.1552         | 23098384          |
| 15.258        | 0.5441 | 17000 | 15.1557         | 23371136          |
| 14.6926       | 0.5505 | 17200 | 15.1556         | 23647856          |
| 15.8429       | 0.5569 | 17400 | 15.1555         | 23921008          |
| 15.6753       | 0.5633 | 17600 | 15.1550         | 24194480          |
| 16.3745       | 0.5697 | 17800 | 15.1500         | 24469312          |
| 16.2398       | 0.5761 | 18000 | 15.1560         | 24743360          |
| 15.6079       | 0.5825 | 18200 | 15.1588         | 25020352          |
| 15.0036       | 0.5890 | 18400 | 15.1589         | 25295920          |
| 15.0048       | 0.5954 | 18600 | 15.1587         | 25571232          |
| 15.138        | 0.6018 | 18800 | 15.1586         | 25847664          |
| 15.5366       | 0.6082 | 19000 | 15.1587         | 26125328          |
| 13.933        | 0.6146 | 19200 | 15.1587         | 26404064          |
| 14.128        | 0.6210 | 19400 | 15.1587         | 26677504          |
| 15.8011       | 0.6274 | 19600 | 15.1587         | 26952544          |
| 15.3115       | 0.6338 | 19800 | 15.1588         | 27226896          |
| 14.7545       | 0.6402 | 20000 | 15.1588         | 27501216          |
| 14.6687       | 0.6466 | 20200 | 15.1588         | 27776624          |
| 15.6931       | 0.6530 | 20400 | 15.1588         | 28051872          |
| 13.5465       | 0.6594 | 20600 | 15.1588         | 28325632          |
| 13.7558       | 0.6658 | 20800 | 15.1588         | 28598784          |
| 15.2542       | 0.6722 | 21000 | 15.1588         | 28874800          |
| 16.6519       | 0.6786 | 21200 | 15.1597         | 29151312          |
| 17.4344       | 0.6850 | 21400 | 15.1596         | 29425936          |
| 12.8174       | 0.6914 | 21600 | 15.1603         | 29702784          |
| 15.1029       | 0.6978 | 21800 | 15.1595         | 29979824          |
| 14.3423       | 0.7042 | 22000 | 15.1599         | 30256128          |
| 15.6432       | 0.7106 | 22200 | 15.1586         | 30528032          |
| 14.3856       | 0.7170 | 22400 | 15.1591         | 30803904          |
| 15.1348       | 0.7234 | 22600 | 15.1591         | 31077632          |
| 15.322        | 0.7298 | 22800 | 15.1594         | 31354544          |
| 14.2641       | 0.7362 | 23000 | 15.1594         | 31626736          |
| 16.7093       | 0.7426 | 23200 | 15.1601         | 31901472          |
| 16.5769       | 0.7490 | 23400 | 15.1601         | 32179968          |
| 15.7483       | 0.7554 | 23600 | 15.1596         | 32457728          |
| 15.3673       | 0.7618 | 23800 | 15.1589         | 32732288          |
| 13.9073       | 0.7682 | 24000 | 15.1598         | 33007504          |
| 14.3764       | 0.7746 | 24200 | 15.1597         | 33281968          |
| 14.7825       | 0.7810 | 24400 | 15.1586         | 33558736          |
| 14.8331       | 0.7874 | 24600 | 15.1586         | 33830832          |
| 14.3912       | 0.7938 | 24800 | 15.1594         | 34104944          |
| 14.5176       | 0.8002 | 25000 | 15.1601         | 34381536          |
| 15.3626       | 0.8066 | 25200 | 15.1596         | 34654672          |
| 15.0243       | 0.8130 | 25400 | 15.1601         | 34931520          |
| 15.9599       | 0.8194 | 25600 | 15.1601         | 35206448          |
| 14.9095       | 0.8258 | 25800 | 15.1596         | 35482800          |
| 16.1984       | 0.8322 | 26000 | 15.1599         | 35756816          |
| 15.5676       | 0.8386 | 26200 | 15.1601         | 36031296          |
| 15.3677       | 0.8450 | 26400 | 15.1601         | 36307968          |
| 16.4824       | 0.8514 | 26600 | 15.1601         | 36580432          |
| 14.258        | 0.8578 | 26800 | 15.1596         | 36855328          |
| 15.7652       | 0.8642 | 27000 | 15.1601         | 37133072          |
| 14.3243       | 0.8706 | 27200 | 15.1601         | 37404464          |
| 15.1033       | 0.8770 | 27400 | 15.1601         | 37675456          |
| 14.5413       | 0.8834 | 27600 | 15.1601         | 37951616          |
| 13.5684       | 0.8898 | 27800 | 15.1593         | 38225840          |
| 15.5201       | 0.8962 | 28000 | 15.1593         | 38498736          |
| 15.159        | 0.9026 | 28200 | 15.1593         | 38771760          |
| 16.9553       | 0.9090 | 28400 | 15.1597         | 39045824          |
| 12.5237       | 0.9154 | 28600 | 15.1597         | 39320736          |
| 14.7548       | 0.9218 | 28800 | 15.1597         | 39594816          |
| 14.4744       | 0.9282 | 29000 | 15.1597         | 39870432          |
| 16.6983       | 0.9346 | 29200 | 15.1597         | 40144672          |
| 13.4943       | 0.9410 | 29400 | 15.1597         | 40420752          |
| 15.7357       | 0.9474 | 29600 | 15.1597         | 40696672          |
| 13.7294       | 0.9538 | 29800 | 15.1597         | 40970096          |
| 14.1421       | 0.9602 | 30000 | 15.1597         | 41245904          |
| 15.3599       | 0.9666 | 30200 | 15.1597         | 41519232          |
| 14.5299       | 0.9730 | 30400 | 15.1597         | 41791520          |
| 15.7709       | 0.9795 | 30600 | 15.1597         | 42066928          |
| 14.3146       | 0.9859 | 30800 | 15.1597         | 42339616          |
| 15.0257       | 0.9923 | 31000 | 15.1597         | 42616352          |
| 15.7705       | 0.9987 | 31200 | 15.1597         | 42892688          |
| 15.5756       | 1.0051 | 31400 | 15.1597         | 43167792          |
| 15.867        | 1.0115 | 31600 | 15.1597         | 43444592          |
| 13.7023       | 1.0179 | 31800 | 15.1597         | 43719328          |
| 13.9696       | 1.0243 | 32000 | 15.1597         | 43994064          |
| 14.3737       | 1.0307 | 32200 | 15.1597         | 44269712          |
| 14.9996       | 1.0371 | 32400 | 15.1597         | 44545408          |
| 12.366        | 1.0435 | 32600 | 15.1597         | 44819808          |
| 14.4042       | 1.0499 | 32800 | 15.1597         | 45097904          |
| 15.2717       | 1.0563 | 33000 | 15.1597         | 45376272          |
| 14.2669       | 1.0627 | 33200 | 15.1597         | 45647824          |
| 14.3355       | 1.0691 | 33400 | 15.1597         | 45922032          |
| 14.4671       | 1.0755 | 33600 | 15.1597         | 46197840          |
| 13.329        | 1.0819 | 33800 | 15.1597         | 46474848          |
| 15.1515       | 1.0883 | 34000 | 15.1597         | 46749824          |
| 13.4913       | 1.0947 | 34200 | 15.1597         | 47023856          |
| 15.6598       | 1.1011 | 34400 | 15.1597         | 47301520          |
| 14.2071       | 1.1075 | 34600 | 15.1597         | 47574864          |
| 16.7937       | 1.1139 | 34800 | 15.1597         | 47853888          |
| 14.6934       | 1.1203 | 35000 | 15.1597         | 48129792          |
| 15.649        | 1.1267 | 35200 | 15.1597         | 48405024          |
| 13.6601       | 1.1331 | 35400 | 15.1597         | 48678592          |
| 14.6172       | 1.1395 | 35600 | 15.1597         | 48954048          |
| 15.9964       | 1.1459 | 35800 | 15.1597         | 49232480          |
| 15.4332       | 1.1523 | 36000 | 15.1597         | 49505040          |
| 14.3303       | 1.1587 | 36200 | 15.1597         | 49778864          |
| 15.196        | 1.1651 | 36400 | 15.1597         | 50051632          |
| 14.1885       | 1.1715 | 36600 | 15.1597         | 50325888          |
| 13.9557       | 1.1779 | 36800 | 15.1597         | 50601136          |
| 14.7712       | 1.1843 | 37000 | 15.1597         | 50876992          |
| 14.3635       | 1.1907 | 37200 | 15.1597         | 51153296          |
| 16.9773       | 1.1971 | 37400 | 15.1597         | 51427552          |
| 17.6143       | 1.2035 | 37600 | 15.1597         | 51707088          |
| 13.8467       | 1.2099 | 37800 | 15.1597         | 51981712          |
| 14.8764       | 1.2163 | 38000 | 15.1597         | 52254352          |
| 16.6284       | 1.2227 | 38200 | 15.1597         | 52529584          |
| 14.2186       | 1.2291 | 38400 | 15.1597         | 52803776          |
| 15.0678       | 1.2355 | 38600 | 15.1597         | 53078736          |
| 14.9189       | 1.2419 | 38800 | 15.1597         | 53352672          |
| 13.976        | 1.2483 | 39000 | 15.1597         | 53628768          |
| 13.8707       | 1.2547 | 39200 | 15.1597         | 53905216          |
| 13.4688       | 1.2611 | 39400 | 15.1597         | 54178832          |
| 13.2421       | 1.2675 | 39600 | 15.1597         | 54454880          |
| 14.5777       | 1.2739 | 39800 | 15.1597         | 54727600          |
| 16.7093       | 1.2803 | 40000 | 15.1597         | 55002224          |

Framework versions

  • PEFT 0.15.2.dev0
  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1
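
To check that a local environment matches these pins before loading the adapter, a small sketch using importlib.metadata (the expected-version mapping simply restates the list above):

```python
from importlib.metadata import version

# Expected versions copied from the "Framework versions" list above.
expected_versions = {
    "peft": "0.15.2.dev0",
    "transformers": "4.51.3",
    "torch": "2.6.0+cu124",
    "datasets": "3.5.0",
    "tokenizers": "0.21.1",
}

for pkg, expected in expected_versions.items():
    installed = version(pkg)
    status = "OK" if installed == expected else f"expected {expected}"
    print(f"{pkg}: {installed} ({status})")
```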