train_record_1745950246

This model is a fine-tuned version of google/gemma-3-1b-it on the record dataset. It achieves the following results on the evaluation set:

  • Loss: 0.7299
  • Num Input Tokens Seen: 55002224

Model description

More information needed

Intended uses & limitations

More information needed
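
While the card does not document usage, a minimal loading sketch may help. This is an assumption-laden example, not the author's documented workflow: it treats this repo as a PEFT adapter on top of google/gemma-3-1b-it (consistent with the framework versions listed below) and uses the repo id rbelanec/train_record_1745950246.

```python
# Hedged sketch: load the base model, then apply this PEFT adapter on top.
# Assumes the repo id below and a standard transformers + peft setup.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("google/gemma-3-1b-it")
tokenizer = AutoTokenizer.from_pretrained("google/gemma-3-1b-it")
model = PeftModel.from_pretrained(base, "rbelanec/train_record_1745950246")

prompt = "Example input"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```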

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them to TrainingArguments follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 123
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 4
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • training_steps: 40000
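
A hedged sketch of how these settings might map onto transformers.TrainingArguments; the output_dir and the surrounding training script are assumptions, not the author's actual code:

```python
# Hedged sketch: the listed hyperparameters expressed as TrainingArguments.
# output_dir is a placeholder; everything else mirrors the list above.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="train_record_1745950246",  # placeholder
    learning_rate=5e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=2,   # total train batch size: 2 * 2 = 4
    seed=123,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    max_steps=40000,
)
```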

Training results

| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:-----:|:---------------:|:-----------------:|
| 2.8233 | 0.0064 | 200 | 4.0758 | 277264 |
| 1.6949 | 0.0128 | 400 | 1.6427 | 548976 |
| 1.4877 | 0.0192 | 600 | 1.3227 | 826016 |
| 1.3337 | 0.0256 | 800 | 1.2206 | 1099968 |
| 1.2957 | 0.0320 | 1000 | 1.1480 | 1374672 |
| 1.0295 | 0.0384 | 1200 | 1.1003 | 1647936 |
| 1.0462 | 0.0448 | 1400 | 1.0607 | 1921648 |
| 0.8464 | 0.0512 | 1600 | 1.0352 | 2194448 |
| 0.8272 | 0.0576 | 1800 | 1.0096 | 2472048 |
| 1.1763 | 0.0640 | 2000 | 0.9865 | 2746752 |
| 1.3931 | 0.0704 | 2200 | 0.9697 | 3020144 |
| 1.2572 | 0.0768 | 2400 | 0.9548 | 3296624 |
| 1.0349 | 0.0832 | 2600 | 0.9399 | 3571808 |
| 1.1218 | 0.0896 | 2800 | 0.9313 | 3847184 |
| 0.8946 | 0.0960 | 3000 | 0.9185 | 4121024 |
| 0.7198 | 0.1024 | 3200 | 0.9094 | 4396880 |
| 0.8046 | 0.1088 | 3400 | 0.9014 | 4671152 |
| 0.8053 | 0.1152 | 3600 | 0.8916 | 4950800 |
| 0.9251 | 0.1216 | 3800 | 0.8846 | 5228512 |
| 0.845 | 0.1280 | 4000 | 0.8784 | 5504608 |
| 0.9929 | 0.1344 | 4200 | 0.8734 | 5778176 |
| 0.7947 | 0.1408 | 4400 | 0.8657 | 6055712 |
| 0.6828 | 0.1472 | 4600 | 0.8624 | 6331680 |
| 0.7723 | 0.1536 | 4800 | 0.8569 | 6604544 |
| 0.8328 | 0.1600 | 5000 | 0.8511 | 6882256 |
| 0.8425 | 0.1664 | 5200 | 0.8481 | 7159072 |
| 0.7358 | 0.1728 | 5400 | 0.8421 | 7433136 |
| 0.9727 | 0.1792 | 5600 | 0.8378 | 7707776 |
| 0.6727 | 0.1856 | 5800 | 0.8367 | 7985472 |
| 0.7569 | 0.1920 | 6000 | 0.8327 | 8259552 |
| 1.0037 | 0.1985 | 6200 | 0.8338 | 8535952 |
| 0.661 | 0.2049 | 6400 | 0.8256 | 8809968 |
| 0.9558 | 0.2113 | 6600 | 0.8226 | 9084016 |
| 0.869 | 0.2177 | 6800 | 0.8198 | 9357456 |
| 0.6787 | 0.2241 | 7000 | 0.8197 | 9630608 |
| 0.6022 | 0.2305 | 7200 | 0.8149 | 9907888 |
| 0.6105 | 0.2369 | 7400 | 0.8124 | 10182048 |
| 1.0618 | 0.2433 | 7600 | 0.8104 | 10458544 |
| 0.7117 | 0.2497 | 7800 | 0.8075 | 10736144 |
| 0.8245 | 0.2561 | 8000 | 0.8047 | 11010512 |
| 0.6891 | 0.2625 | 8200 | 0.8023 | 11284128 |
| 0.8266 | 0.2689 | 8400 | 0.8010 | 11556816 |
| 0.9607 | 0.2753 | 8600 | 0.7989 | 11828816 |
| 0.6199 | 0.2817 | 8800 | 0.7967 | 12104176 |
| 0.733 | 0.2881 | 9000 | 0.7948 | 12378784 |
| 0.7747 | 0.2945 | 9200 | 0.7927 | 12654368 |
| 0.6677 | 0.3009 | 9400 | 0.7914 | 12927088 |
| 0.9276 | 0.3073 | 9600 | 0.7888 | 13199552 |
| 0.8887 | 0.3137 | 9800 | 0.7877 | 13473952 |
| 0.7798 | 0.3201 | 10000 | 0.7864 | 13750288 |
| 0.7029 | 0.3265 | 10200 | 0.7865 | 14025248 |
| 0.7353 | 0.3329 | 10400 | 0.7841 | 14300160 |
| 0.6306 | 0.3393 | 10600 | 0.7833 | 14577760 |
| 0.7592 | 0.3457 | 10800 | 0.7805 | 14851280 |
| 0.588 | 0.3521 | 11000 | 0.7808 | 15125104 |
| 0.7482 | 0.3585 | 11200 | 0.7781 | 15398624 |
| 0.9051 | 0.3649 | 11400 | 0.7771 | 15672384 |
| 0.8379 | 0.3713 | 11600 | 0.7754 | 15946384 |
| 0.6137 | 0.3777 | 11800 | 0.7736 | 16220112 |
| 0.7876 | 0.3841 | 12000 | 0.7733 | 16493920 |
| 0.7736 | 0.3905 | 12200 | 0.7744 | 16771376 |
| 0.877 | 0.3969 | 12400 | 0.7717 | 17046656 |
| 0.6752 | 0.4033 | 12600 | 0.7706 | 17318272 |
| 0.6949 | 0.4097 | 12800 | 0.7698 | 17591696 |
| 0.7197 | 0.4161 | 13000 | 0.7689 | 17864256 |
| 0.6706 | 0.4225 | 13200 | 0.7704 | 18137984 |
| 0.8119 | 0.4289 | 13400 | 0.7684 | 18413504 |
| 0.8481 | 0.4353 | 13600 | 0.7665 | 18690528 |
| 0.8423 | 0.4417 | 13800 | 0.7652 | 18966352 |
| 0.8398 | 0.4481 | 14000 | 0.7640 | 19242160 |
| 0.7095 | 0.4545 | 14200 | 0.7629 | 19518832 |
| 0.7228 | 0.4609 | 14400 | 0.7624 | 19795920 |
| 0.4926 | 0.4673 | 14600 | 0.7622 | 20073168 |
| 0.4827 | 0.4737 | 14800 | 0.7614 | 20349056 |
| 0.9413 | 0.4801 | 15000 | 0.7603 | 20622896 |
| 0.83 | 0.4865 | 15200 | 0.7598 | 20896768 |
| 0.5562 | 0.4929 | 15400 | 0.7583 | 21171376 |
| 0.7659 | 0.4993 | 15600 | 0.7580 | 21447568 |
| 0.9216 | 0.5057 | 15800 | 0.7575 | 21722256 |
| 0.7743 | 0.5121 | 16000 | 0.7572 | 21998320 |
| 0.6469 | 0.5185 | 16200 | 0.7558 | 22273616 |
| 0.7036 | 0.5249 | 16400 | 0.7551 | 22549280 |
| 0.8155 | 0.5313 | 16600 | 0.7546 | 22823984 |
| 0.7568 | 0.5377 | 16800 | 0.7551 | 23098384 |
| 0.7415 | 0.5441 | 17000 | 0.7532 | 23371136 |
| 0.5823 | 0.5505 | 17200 | 0.7527 | 23647856 |
| 0.6466 | 0.5569 | 17400 | 0.7518 | 23921008 |
| 0.7807 | 0.5633 | 17600 | 0.7514 | 24194480 |
| 0.7174 | 0.5697 | 17800 | 0.7505 | 24469312 |
| 0.7576 | 0.5761 | 18000 | 0.7501 | 24743360 |
| 0.961 | 0.5825 | 18200 | 0.7497 | 25020352 |
| 0.9696 | 0.5890 | 18400 | 0.7490 | 25295920 |
| 0.6775 | 0.5954 | 18600 | 0.7484 | 25571232 |
| 0.8449 | 0.6018 | 18800 | 0.7478 | 25847664 |
| 0.8842 | 0.6082 | 19000 | 0.7478 | 26125328 |
| 0.7932 | 0.6146 | 19200 | 0.7467 | 26404064 |
| 0.6471 | 0.6210 | 19400 | 0.7466 | 26677504 |
| 0.8983 | 0.6274 | 19600 | 0.7469 | 26952544 |
| 0.7124 | 0.6338 | 19800 | 0.7465 | 27226896 |
| 0.7017 | 0.6402 | 20000 | 0.7465 | 27501216 |
| 0.6379 | 0.6466 | 20200 | 0.7448 | 27776624 |
| 0.6725 | 0.6530 | 20400 | 0.7443 | 28051872 |
| 0.6149 | 0.6594 | 20600 | 0.7431 | 28325632 |
| 0.9104 | 0.6658 | 20800 | 0.7431 | 28598784 |
| 0.7189 | 0.6722 | 21000 | 0.7426 | 28874800 |
| 0.6754 | 0.6786 | 21200 | 0.7428 | 29151312 |
| 1.0405 | 0.6850 | 21400 | 0.7422 | 29425936 |
| 0.8179 | 0.6914 | 21600 | 0.7416 | 29702784 |
| 0.9606 | 0.6978 | 21800 | 0.7411 | 29979824 |
| 0.7739 | 0.7042 | 22000 | 0.7404 | 30256128 |
| 0.6248 | 0.7106 | 22200 | 0.7403 | 30528032 |
| 0.7647 | 0.7170 | 22400 | 0.7410 | 30803904 |
| 0.828 | 0.7234 | 22600 | 0.7406 | 31077632 |
| 1.0034 | 0.7298 | 22800 | 0.7403 | 31354544 |
| 0.6809 | 0.7362 | 23000 | 0.7400 | 31626736 |
| 0.6563 | 0.7426 | 23200 | 0.7396 | 31901472 |
| 0.7809 | 0.7490 | 23400 | 0.7391 | 32179968 |
| 0.8411 | 0.7554 | 23600 | 0.7388 | 32457728 |
| 0.7451 | 0.7618 | 23800 | 0.7382 | 32732288 |
| 0.7342 | 0.7682 | 24000 | 0.7381 | 33007504 |
| 0.7153 | 0.7746 | 24200 | 0.7375 | 33281968 |
| 0.7706 | 0.7810 | 24400 | 0.7381 | 33558736 |
| 0.7155 | 0.7874 | 24600 | 0.7376 | 33830832 |
| 0.9201 | 0.7938 | 24800 | 0.7376 | 34104944 |
| 0.7624 | 0.8002 | 25000 | 0.7368 | 34381536 |
| 0.6255 | 0.8066 | 25200 | 0.7362 | 34654672 |
| 0.7843 | 0.8130 | 25400 | 0.7355 | 34931520 |
| 0.6743 | 0.8194 | 25600 | 0.7354 | 35206448 |
| 0.632 | 0.8258 | 25800 | 0.7356 | 35482800 |
| 0.8752 | 0.8322 | 26000 | 0.7352 | 35756816 |
| 0.6129 | 0.8386 | 26200 | 0.7356 | 36031296 |
| 0.7504 | 0.8450 | 26400 | 0.7348 | 36307968 |
| 0.9538 | 0.8514 | 26600 | 0.7345 | 36580432 |
| 0.7165 | 0.8578 | 26800 | 0.7340 | 36855328 |
| 0.8094 | 0.8642 | 27000 | 0.7339 | 37133072 |
| 0.7163 | 0.8706 | 27200 | 0.7340 | 37404464 |
| 0.7094 | 0.8770 | 27400 | 0.7335 | 37675456 |
| 0.6543 | 0.8834 | 27600 | 0.7332 | 37951616 |
| 0.6482 | 0.8898 | 27800 | 0.7332 | 38225840 |
| 0.7161 | 0.8962 | 28000 | 0.7330 | 38498736 |
| 0.6364 | 0.9026 | 28200 | 0.7328 | 38771760 |
| 0.7501 | 0.9090 | 28400 | 0.7328 | 39045824 |
| 0.505 | 0.9154 | 28600 | 0.7328 | 39320736 |
| 0.6496 | 0.9218 | 28800 | 0.7324 | 39594816 |
| 0.7856 | 0.9282 | 29000 | 0.7326 | 39870432 |
| 0.6598 | 0.9346 | 29200 | 0.7323 | 40144672 |
| 0.7067 | 0.9410 | 29400 | 0.7323 | 40420752 |
| 0.7768 | 0.9474 | 29600 | 0.7318 | 40696672 |
| 0.4495 | 0.9538 | 29800 | 0.7317 | 40970096 |
| 0.6259 | 0.9602 | 30000 | 0.7321 | 41245904 |
| 0.9303 | 0.9666 | 30200 | 0.7320 | 41519232 |
| 1.0491 | 0.9730 | 30400 | 0.7319 | 41791520 |
| 0.6816 | 0.9795 | 30600 | 0.7319 | 42066928 |
| 0.8694 | 0.9859 | 30800 | 0.7326 | 42339616 |
| 0.9029 | 0.9923 | 31000 | 0.7318 | 42616352 |
| 0.8713 | 0.9987 | 31200 | 0.7324 | 42892688 |
| 0.4775 | 1.0051 | 31400 | 0.7324 | 43167792 |
| 0.7046 | 1.0115 | 31600 | 0.7319 | 43444592 |
| 0.7093 | 1.0179 | 31800 | 0.7318 | 43719328 |
| 0.58 | 1.0243 | 32000 | 0.7312 | 43994064 |
| 0.7032 | 1.0307 | 32200 | 0.7314 | 44269712 |
| 0.7959 | 1.0371 | 32400 | 0.7309 | 44545408 |
| 0.5933 | 1.0435 | 32600 | 0.7311 | 44819808 |
| 0.6621 | 1.0499 | 32800 | 0.7309 | 45097904 |
| 0.7677 | 1.0563 | 33000 | 0.7308 | 45376272 |
| 0.7354 | 1.0627 | 33200 | 0.7307 | 45647824 |
| 0.7796 | 1.0691 | 33400 | 0.7308 | 45922032 |
| 0.7574 | 1.0755 | 33600 | 0.7306 | 46197840 |
| 0.8987 | 1.0819 | 33800 | 0.7303 | 46474848 |
| 0.6623 | 1.0883 | 34000 | 0.7307 | 46749824 |
| 0.8771 | 1.0947 | 34200 | 0.7308 | 47023856 |
| 0.6231 | 1.1011 | 34400 | 0.7305 | 47301520 |
| 0.6705 | 1.1075 | 34600 | 0.7306 | 47574864 |
| 1.0773 | 1.1139 | 34800 | 0.7307 | 47853888 |
| 0.5722 | 1.1203 | 35000 | 0.7305 | 48129792 |
| 0.7991 | 1.1267 | 35200 | 0.7307 | 48405024 |
| 0.7533 | 1.1331 | 35400 | 0.7306 | 48678592 |
| 0.7164 | 1.1395 | 35600 | 0.7302 | 48954048 |
| 0.6525 | 1.1459 | 35800 | 0.7302 | 49232480 |
| 0.7326 | 1.1523 | 36000 | 0.7303 | 49505040 |
| 0.6382 | 1.1587 | 36200 | 0.7303 | 49778864 |
| 0.7743 | 1.1651 | 36400 | 0.7299 | 50051632 |
| 0.5618 | 1.1715 | 36600 | 0.7302 | 50325888 |
| 0.5741 | 1.1779 | 36800 | 0.7303 | 50601136 |
| 0.7073 | 1.1843 | 37000 | 0.7300 | 50876992 |
| 0.8427 | 1.1907 | 37200 | 0.7301 | 51153296 |
| 0.9195 | 1.1971 | 37400 | 0.7300 | 51427552 |
| 0.8146 | 1.2035 | 37600 | 0.7300 | 51707088 |
| 0.682 | 1.2099 | 37800 | 0.7302 | 51981712 |
| 0.6585 | 1.2163 | 38000 | 0.7302 | 52254352 |
| 0.7149 | 1.2227 | 38200 | 0.7300 | 52529584 |
| 0.892 | 1.2291 | 38400 | 0.7299 | 52803776 |
| 0.626 | 1.2355 | 38600 | 0.7300 | 53078736 |
| 0.6822 | 1.2419 | 38800 | 0.7300 | 53352672 |
| 0.5693 | 1.2483 | 39000 | 0.7302 | 53628768 |
| 0.7027 | 1.2547 | 39200 | 0.7301 | 53905216 |
| 0.838 | 1.2611 | 39400 | 0.7300 | 54178832 |
| 0.5893 | 1.2675 | 39600 | 0.7300 | 54454880 |
| 0.818 | 1.2739 | 39800 | 0.7300 | 54727600 |
| 0.6806 | 1.2803 | 40000 | 0.7300 | 55002224 |
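
Validation loss plateaus at roughly 0.730 from around step 30,000 onward, consistent with the final evaluation loss of 0.7299.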

Framework versions

  • PEFT 0.15.2.dev0
  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1