train_cb_1745950311

This model is a fine-tuned version of google/gemma-3-1b-it on the cb dataset. It achieves the following results on the evaluation set:

  • Loss: 3.1655 (the lowest validation loss in the results table below, reached at step 7600; the final-step value is 3.2396)
  • Num Input Tokens Seen: 22718312
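
PEFT appears under the framework versions below, so this checkpoint is most likely a parameter-efficient adapter on top of google/gemma-3-1b-it rather than a full set of fine-tuned weights. The following is a minimal loading sketch, not a verified recipe: the adapter repo id rbelanec/train_cb_1745950311 and the prompt format are assumptions, and cb is presumed to be the SuperGLUE CommitmentBank task.

  # Hedged sketch: load the base model and attach this adapter with PEFT.
  # The adapter repo id is an assumption based on this card's name.
  from transformers import AutoModelForCausalLM, AutoTokenizer
  from peft import PeftModel

  base_id = "google/gemma-3-1b-it"
  adapter_id = "rbelanec/train_cb_1745950311"  # assumed repo id

  tokenizer = AutoTokenizer.from_pretrained(base_id)
  model = AutoModelForCausalLM.from_pretrained(base_id)
  model = PeftModel.from_pretrained(model, adapter_id)

  # Illustrative prompt only: CommitmentBank pairs a premise with a hypothesis
  # and asks for entailment / contradiction / neutral.
  prompt = "Premise: ...\nHypothesis: ...\nAnswer:"
  inputs = tokenizer(prompt, return_tensors="pt")
  outputs = model.generate(**inputs, max_new_tokens=10)
  print(tokenizer.decode(outputs[0], skip_special_tokens=True))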

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch of the equivalent TrainingArguments follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 123
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 4
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • training_steps: 40000
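
As a rough illustration, these settings could be expressed with transformers.TrainingArguments as sketched below. The output_dir is a placeholder, and any PEFT/LoRA-specific configuration used to build the adapter is not recorded on this card, so it is omitted.

  # Hedged sketch: the hyperparameters above as a TrainingArguments config.
  from transformers import TrainingArguments

  training_args = TrainingArguments(
      output_dir="train_cb_1745950311",  # placeholder
      learning_rate=5e-5,
      per_device_train_batch_size=2,
      per_device_eval_batch_size=2,
      gradient_accumulation_steps=2,     # total train batch size 2 * 2 = 4
      seed=123,
      lr_scheduler_type="cosine",
      max_steps=40_000,
      optim="adamw_torch",
      adam_beta1=0.9,
      adam_beta2=0.999,
      adam_epsilon=1e-8,
  )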

Training results

Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen
4.0807 3.5133 200 3.3497 114504
3.6519 7.0177 400 3.2679 228504
3.6797 10.5310 600 3.2499 341136
4.041 14.0354 800 3.2563 455488
3.7512 17.5487 1000 3.2101 569504
3.8353 21.0531 1200 3.2585 682024
2.9458 24.5664 1400 3.2220 796328
3.3649 28.0708 1600 3.2215 909320
2.9958 31.5841 1800 3.2204 1023696
3.6459 35.0885 2000 3.2341 1137280
3.407 38.6018 2200 3.2116 1251592
3.3997 42.1062 2400 3.2450 1364312
3.4376 45.6195 2600 3.2132 1478704
3.1712 49.1239 2800 3.2435 1591424
2.8129 52.6372 3000 3.2367 1705000
3.2112 56.1416 3200 3.2195 1818688
3.444 59.6549 3400 3.2276 1932248
3.3753 63.1593 3600 3.2073 2045464
3.5961 66.6726 3800 3.2277 2159128
3.4198 70.1770 4000 3.2111 2272792
3.5456 73.6903 4200 3.2131 2387344
3.4667 77.1947 4400 3.2211 2500160
3.4217 80.7080 4600 3.2103 2614032
3.018 84.2124 4800 3.2135 2728488
3.7827 87.7257 5000 3.1999 2842656
3.0636 91.2301 5200 3.2040 2956824
3.1925 94.7434 5400 3.1948 3069840
3.7793 98.2478 5600 3.2220 3183600
3.3979 101.7611 5800 3.2101 3297896
3.2207 105.2655 6000 3.2225 3411544
3.2364 108.7788 6200 3.2056 3525472
2.9953 112.2832 6400 3.2090 3638584
3.7652 115.7965 6600 3.1998 3752608
3.6473 119.3009 6800 3.2282 3865376
3.644 122.8142 7000 3.2166 3979464
3.625 126.3186 7200 3.1989 4093296
3.6067 129.8319 7400 3.2003 4207120
3.9015 133.3363 7600 3.1655 4320568
2.7832 136.8496 7800 3.1700 4434056
3.2049 140.3540 8000 3.2194 4547840
3.7696 143.8673 8200 3.1741 4662192
3.4407 147.3717 8400 3.2007 4774160
3.5295 150.8850 8600 3.1881 4887640
3.5294 154.3894 8800 3.2013 5002864
3.665 157.9027 9000 3.2144 5116216
3.6292 161.4071 9200 3.2175 5229496
3.5929 164.9204 9400 3.2193 5343528
3.5141 168.4248 9600 3.2143 5455520
3.3751 171.9381 9800 3.1960 5571144
3.1501 175.4425 10000 3.2038 5684752
3.7354 178.9558 10200 3.2202 5799088
3.4348 182.4602 10400 3.2499 5911888
3.5634 185.9735 10600 3.2436 6025544
3.0645 189.4779 10800 3.2517 6139264
3.4348 192.9912 11000 3.2544 6252832
3.0555 196.4956 11200 3.2208 6366440
3.0587 200.0 11400 3.2256 6478776
3.3529 203.5133 11600 3.2029 6592280
3.4065 207.0177 11800 3.1870 6704968
3.4619 210.5310 12000 3.1783 6819568
4.1823 214.0354 12200 3.2032 6933264
3.0796 217.5487 12400 3.2419 7045688
2.8819 221.0531 12600 3.2527 7159888
2.9998 224.5664 12800 3.2493 7274296
3.3352 228.0708 13000 3.2510 7387544
3.158 231.5841 13200 3.2489 7500200
3.5065 235.0885 13400 3.2504 7614696
3.6667 238.6018 13600 3.2542 7727608
3.4226 242.1062 13800 3.2458 7840696
2.9556 245.6195 14000 3.2465 7954632
3.7354 249.1239 14200 3.2517 8068648
3.1403 252.6372 14400 3.2444 8181840
3.1824 256.1416 14600 3.2362 8294896
3.3174 259.6549 14800 3.2404 8408512
3.6373 263.1593 15000 3.2357 8522664
3.7092 266.6726 15200 3.2490 8636032
3.6183 270.1770 15400 3.2349 8748624
3.4334 273.6903 15600 3.2301 8863248
2.9108 277.1947 15800 3.2369 8976424
2.8284 280.7080 16000 3.2299 9088984
3.8347 284.2124 16200 3.2340 9204128
3.3208 287.7257 16400 3.2343 9317208
3.6733 291.2301 16600 3.2311 9431208
3.2708 294.7434 16800 3.2320 9544328
3.1661 298.2478 17000 3.2283 9657432
3.1148 301.7611 17200 3.2321 9770824
3.6032 305.2655 17400 3.2318 9884648
3.3582 308.7788 17600 3.2327 9997288
3.0323 312.2832 17800 3.2300 10111472
3.4143 315.7965 18000 3.2333 10223648
3.5111 319.3009 18200 3.2321 10336864
3.3092 322.8142 18400 3.2333 10450688
3.2414 326.3186 18600 3.2352 10563128
3.2062 329.8319 18800 3.2300 10677928
3.3109 333.3363 19000 3.2333 10790896
3.2425 336.8496 19200 3.2343 10904600
3.4983 340.3540 19400 3.2315 11018112
3.2053 343.8673 19600 3.2337 11131712
3.1086 347.3717 19800 3.2389 11245728
2.902 350.8850 20000 3.2306 11358800
3.1164 354.3894 20200 3.2343 11471832
3.4089 357.9027 20400 3.2304 11586368
3.2023 361.4071 20600 3.2315 11700176
3.241 364.9204 20800 3.2343 11814304
3.3168 368.4248 21000 3.2339 11927464
3.744 371.9381 21200 3.2389 12041416
2.8431 375.4425 21400 3.2334 12153176
3.4538 378.9558 21600 3.2341 12267984
3.5769 382.4602 21800 3.2357 12381424
3.1883 385.9735 22000 3.2388 12494280
3.6814 389.4779 22200 3.2323 12608008
3.6376 392.9912 22400 3.2386 12721456
3.4212 396.4956 22600 3.2357 12835240
2.8052 400.0 22800 3.2334 12948416
3.4607 403.5133 23000 3.2386 13061472
3.3598 407.0177 23200 3.2392 13175888
3.2603 410.5310 23400 3.2406 13289752
2.9499 414.0354 23600 3.2435 13403848
2.953 417.5487 23800 3.2435 13518496
3.2406 421.0531 24000 3.2398 13631704
3.7591 424.5664 24200 3.2323 13745200
2.904 428.0708 24400 3.2364 13859752
3.7685 431.5841 24600 3.2348 13972648
3.5038 435.0885 24800 3.2348 14086360
4.0004 438.6018 25000 3.2364 14201656
3.6786 442.1062 25200 3.2364 14314736
3.1436 445.6195 25400 3.2364 14428104
3.531 449.1239 25600 3.2323 14541136
3.2502 452.6372 25800 3.2396 14655696
3.1684 456.1416 26000 3.2403 14768168
3.0653 459.6549 26200 3.2403 14882048
3.1468 463.1593 26400 3.2401 14996008
3.7712 466.6726 26600 3.2401 15109352
3.3503 470.1770 26800 3.2396 15223592
3.4174 473.6903 27000 3.2396 15338072
2.9202 477.1947 27200 3.2396 15451312
3.2351 480.7080 27400 3.2396 15565784
3.1379 484.2124 27600 3.2396 15679720
3.2297 487.7257 27800 3.2396 15792680
3.3102 491.2301 28000 3.2396 15906624
3.4002 494.7434 28200 3.2396 16019936
3.6065 498.2478 28400 3.2396 16133784
3.3413 501.7611 28600 3.2396 16248200
3.844 505.2655 28800 3.2396 16361560
3.9788 508.7788 29000 3.2396 16475624
2.6839 512.2832 29200 3.2396 16588984
3.5726 515.7965 29400 3.2396 16702496
3.3767 519.3009 29600 3.2396 16816272
3.3213 522.8142 29800 3.2396 16929072
3.0932 526.3186 30000 3.2396 17043120
3.3793 529.8319 30200 3.2396 17156344
3.1767 533.3363 30400 3.2396 17268656
3.8454 536.8496 30600 3.2396 17383696
3.3684 540.3540 30800 3.2396 17495648
3.0775 543.8673 31000 3.2396 17609616
3.4404 547.3717 31200 3.2396 17723600
3.3418 550.8850 31400 3.2396 17836576
3.1698 554.3894 31600 3.2396 17949928
3.2589 557.9027 31800 3.2396 18064576
3.15 561.4071 32000 3.2396 18177096
3.204 564.9204 32200 3.2396 18290608
3.981 568.4248 32400 3.2396 18404648
3.1365 571.9381 32600 3.2396 18517216
2.9581 575.4425 32800 3.2396 18631296
3.2565 578.9558 33000 3.2396 18745416
3.4609 582.4602 33200 3.2396 18857896
3.1193 585.9735 33400 3.2396 18971344
3.4526 589.4779 33600 3.2396 19085248
3.4306 592.9912 33800 3.2396 19199136
3.0964 596.4956 34000 3.2396 19311344
3.3854 600.0 34200 3.2396 19425472
2.7999 603.5133 34400 3.2396 19539112
2.9199 607.0177 34600 3.2396 19652392
3.5917 610.5310 34800 3.2396 19766904
3.1552 614.0354 35000 3.2396 19879808
3.3945 617.5487 35200 3.2396 19993952
3.3882 621.0531 35400 3.2396 20107560
3.8557 624.5664 35600 3.2396 20220888
3.0355 628.0708 35800 3.2396 20333904
3.3379 631.5841 36000 3.2396 20446736
3.8296 635.0885 36200 3.2396 20560472
3.5599 638.6018 36400 3.2396 20673984
3.7631 642.1062 36600 3.2396 20786240
3.4259 645.6195 36800 3.2396 20899128
3.2107 649.1239 37000 3.2396 21011928
3.1443 652.6372 37200 3.2396 21126880
2.8292 656.1416 37400 3.2396 21239760
3.1505 659.6549 37600 3.2396 21353776
3.6223 663.1593 37800 3.2396 21467368
3.1295 666.6726 38000 3.2396 21581512
2.9856 670.1770 38200 3.2396 21694376
3.003 673.6903 38400 3.2396 21808568
3.7949 677.1947 38600 3.2396 21922424
3.1024 680.7080 38800 3.2396 22036600
3.8314 684.2124 39000 3.2396 22150992
3.1725 687.7257 39200 3.2396 22263616
3.094 691.2301 39400 3.2396 22377936
3.2364 694.7434 39600 3.2396 22490328
3.1658 698.2478 39800 3.2396 22604096
3.1903 701.7611 40000 3.2396 22718312

Framework versions

  • PEFT 0.15.2.dev0
  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1
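
To sanity-check a local environment against these versions before loading the adapter, something like the sketch below can be used. The comparison is purely informational, and PEFT 0.15.2.dev0 is a development build, so an exact match is unlikely from PyPI alone.

  # Hedged sketch: print installed versions of the libraries listed above.
  from importlib.metadata import version, PackageNotFoundError

  expected = {
      "peft": "0.15.2.dev0",
      "transformers": "4.51.3",
      "torch": "2.6.0+cu124",
      "datasets": "3.5.0",
      "tokenizers": "0.21.1",
  }

  for pkg, want in expected.items():
      try:
          have = version(pkg)
      except PackageNotFoundError:
          have = "not installed"
      print(f"{pkg}: installed {have}, card lists {want}")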