train_boolq_1745950273

This model is a fine-tuned version of google/gemma-3-1b-it on the boolq dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1538
  • Num Input Tokens Seen: 34633072
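
The framework versions and model tree indicate this is a PEFT adapter on top of google/gemma-3-1b-it. A minimal, hedged sketch of loading the adapter for inference is shown below; the exact prompt format used for BoolQ fine-tuning is not documented in this card, so the prompt here is only illustrative.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Minimal sketch, not a confirmed recipe: load the base model and
# attach the fine-tuned adapter from this repository.
base_id = "google/gemma-3-1b-it"
adapter_id = "rbelanec/train_boolq_1745950273"

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
model = PeftModel.from_pretrained(model, adapter_id)
model.eval()

# Hypothetical yes/no prompt; the training prompt template is unknown.
prompt = (
    "Passage: The aurora is visible near the poles.\n"
    "Question: is the aurora visible near the poles?\n"
    "Answer:"
)
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=5)
print(tokenizer.decode(out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```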

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
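
The card's summary names the boolq dataset; a minimal sketch of loading it from the Hugging Face Hub follows (assuming the `google/boolq` Hub id; the prompt formatting and split usage for fine-tuning are not documented here).

```python
from datasets import load_dataset

# Minimal sketch: inspect BoolQ; each example has a passage, a yes/no
# question, and a boolean answer.
boolq = load_dataset("google/boolq")
print(boolq)              # train / validation splits
print(boolq["train"][0])  # {'question': ..., 'answer': ..., 'passage': ...}
```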

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 123
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 4
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • training_steps: 40000
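
These settings map roughly onto a Hugging Face TrainingArguments configuration as sketched below; the output directory is a placeholder, and the PEFT/LoRA configuration and data collation are not documented in this card.

```python
from transformers import TrainingArguments

# Hypothetical sketch of the hyperparameters listed above.
args = TrainingArguments(
    output_dir="train_boolq_1745950273",   # placeholder path
    learning_rate=5e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=2,          # effective train batch size: 2 * 2 = 4
    seed=123,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    max_steps=40000,
)
```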

Training results

Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen
0.2294 0.0943 200 0.2528 174096
0.4076 0.1886 400 0.2382 344560
0.3531 0.2829 600 0.2216 517536
0.2417 0.3772 800 0.1999 696016
0.1596 0.4715 1000 0.2032 868992
0.2865 0.5658 1200 0.1945 1040544
0.4192 0.6601 1400 0.1821 1211680
0.1268 0.7544 1600 0.1738 1381792
0.172 0.8487 1800 0.1745 1559456
0.3152 0.9430 2000 0.1766 1735840
0.1413 1.0372 2200 0.1739 1910848
0.1775 1.1315 2400 0.2023 2081696
0.1675 1.2258 2600 0.1657 2255952
0.1501 1.3201 2800 0.1832 2427152
0.0747 1.4144 3000 0.1788 2601296
0.0954 1.5087 3200 0.1857 2774672
0.0947 1.6030 3400 0.1752 2944896
0.1546 1.6973 3600 0.1971 3117216
0.2168 1.7916 3800 0.1538 3287952
0.2023 1.8859 4000 0.1737 3464640
0.0789 1.9802 4200 0.1676 3638880
0.0811 2.0745 4400 0.2531 3812624
0.036 2.1688 4600 0.2118 3986544
0.0806 2.2631 4800 0.1946 4158272
0.2564 2.3574 5000 0.2272 4328240
0.0797 2.4517 5200 0.2316 4507760
0.204 2.5460 5400 0.2353 4681664
0.0162 2.6403 5600 0.2062 4856928
0.0736 2.7346 5800 0.2187 5024976
0.1102 2.8289 6000 0.1996 5202368
0.0752 2.9231 6200 0.2483 5377360
0.0575 3.0174 6400 0.2213 5550480
0.0373 3.1117 6600 0.3116 5724080
0.1205 3.2060 6800 0.2736 5896688
0.0456 3.3003 7000 0.2599 6070544
0.0039 3.3946 7200 0.2499 6244624
0.0714 3.4889 7400 0.2836 6416176
0.1576 3.5832 7600 0.2947 6587616
0.1611 3.6775 7800 0.2595 6759696
0.1766 3.7718 8000 0.2564 6932384
0.0044 3.8661 8200 0.2501 7103328
0.0091 3.9604 8400 0.2621 7276304
0.0009 4.0547 8600 0.3677 7448112
0.0241 4.1490 8800 0.3428 7623632
0.0001 4.2433 9000 0.4377 7799248
0.0272 4.3376 9200 0.3371 7974368
0.0004 4.4319 9400 0.4070 8146384
0.0087 4.5262 9600 0.3419 8321456
0.0015 4.6205 9800 0.3727 8490096
0.0006 4.7148 10000 0.3746 8665904
0.0074 4.8091 10200 0.4006 8837712
0.0006 4.9033 10400 0.3620 9010400
0.0738 4.9976 10600 0.3870 9185584
0.0004 5.0919 10800 0.4343 9358160
0.0001 5.1862 11000 0.5670 9535520
0.0 5.2805 11200 0.6110 9709232
0.1252 5.3748 11400 0.6078 9880896
0.0001 5.4691 11600 0.4508 10053056
0.0551 5.5634 11800 0.4355 10229152
0.0003 5.6577 12000 0.4886 10404384
0.0074 5.7520 12200 0.4803 10573872
0.0002 5.8463 12400 0.4446 10748304
0.067 5.9406 12600 0.5511 10917920
0.0001 6.0349 12800 0.5721 11092736
0.0 6.1292 13000 0.5537 11269264
0.0 6.2235 13200 0.5562 11441120
0.0 6.3178 13400 0.7728 11614176
0.0 6.4121 13600 0.5305 11785424
0.0 6.5064 13800 0.6761 11960752
0.0 6.6007 14000 0.5254 12132672
0.0 6.6950 14200 0.5536 12303424
0.0003 6.7893 14400 0.5881 12474592
0.0 6.8835 14600 0.6505 12649424
0.0 6.9778 14800 0.5357 12821280
0.0 7.0721 15000 0.6436 12996208
0.0 7.1664 15200 0.6179 13172592
0.0 7.2607 15400 0.6625 13342864
0.0 7.3550 15600 0.6576 13515600
0.1354 7.4493 15800 0.7310 13688640
0.0002 7.5436 16000 0.5292 13863312
0.0 7.6379 16200 0.6760 14032992
0.0021 7.7322 16400 0.5667 14205936
0.0 7.8265 16600 0.5081 14378336
0.0 7.9208 16800 0.6207 14551456
0.0 8.0151 17000 0.7328 14730672
0.1146 8.1094 17200 0.5845 14904544
0.0 8.2037 17400 0.6697 15078832
0.0 8.2980 17600 0.6067 15254544
0.0001 8.3923 17800 0.5690 15422256
0.0 8.4866 18000 0.7316 15595776
0.0 8.5809 18200 0.6704 15768288
0.1208 8.6752 18400 0.7145 15941776
0.0 8.7694 18600 0.6353 16115152
0.0001 8.8637 18800 0.5448 16284384
0.123 8.9580 19000 0.4909 16457552
0.0001 9.0523 19200 0.4956 16632272
0.0569 9.1466 19400 0.6051 16806304
0.0 9.2409 19600 0.5905 16979072
0.0 9.3352 19800 0.6460 17150160
0.0 9.4295 20000 0.6048 17321280
0.0 9.5238 20200 0.6894 17495488
0.0001 9.6181 20400 0.6447 17670576
0.1459 9.7124 20600 0.6084 17843440
0.0 9.8067 20800 0.5676 18012496
0.0 9.9010 21000 0.6301 18186480
0.0 9.9953 21200 0.7091 18360368
0.0 10.0896 21400 0.7228 18539664
0.0 10.1839 21600 0.5961 18718016
0.0 10.2782 21800 0.6606 18888560
0.0 10.3725 22000 0.6740 19061328
0.0 10.4668 22200 0.6405 19236176
0.0002 10.5611 22400 0.7411 19404288
0.0 10.6554 22600 0.7990 19574224
0.0 10.7496 22800 0.7424 19744496
0.0 10.8439 23000 0.6840 19915984
0.0 10.9382 23200 0.7526 20090944
0.0 11.0325 23400 0.7356 20264992
0.0 11.1268 23600 0.7534 20437952
0.0 11.2211 23800 0.6226 20611040
0.0 11.3154 24000 0.6761 20787488
0.0 11.4097 24200 0.7833 20958240
0.0 11.5040 24400 0.8089 21133392
0.0 11.5983 24600 0.5762 21303360
0.0 11.6926 24800 0.6816 21475184
0.0 11.7869 25000 0.6941 21649744
0.0392 11.8812 25200 0.7063 21819728
0.0 11.9755 25400 0.7234 21993120
0.0 12.0698 25600 0.7154 22164624
0.0001 12.1641 25800 0.6475 22340064
0.0 12.2584 26000 0.5743 22515088
0.0 12.3527 26200 0.6342 22692240
0.0 12.4470 26400 0.6528 22864512
0.0 12.5413 26600 0.6088 23037568
0.0 12.6355 26800 0.6722 23207936
0.0 12.7298 27000 0.6821 23381376
0.0 12.8241 27200 0.7039 23553008
0.0 12.9184 27400 0.7463 23722608
0.0 13.0127 27600 0.7673 23892928
0.0 13.1070 27800 0.7768 24063632
0.0 13.2013 28000 0.7842 24237248
0.0 13.2956 28200 0.7988 24411712
0.0 13.3899 28400 0.8151 24584800
0.0 13.4842 28600 0.8249 24759888
0.0 13.5785 28800 0.8314 24936720
0.0 13.6728 29000 0.8392 25110864
0.0 13.7671 29200 0.8851 25284944
0.0 13.8614 29400 0.8816 25456816
0.0 13.9557 29600 0.9020 25631728
0.0 14.0500 29800 0.9063 25801056
0.0 14.1443 30000 0.9111 25978896
0.0 14.2386 30200 0.9165 26156672
0.0 14.3329 30400 0.9197 26330592
0.0 14.4272 30600 0.9362 26502800
0.0 14.5215 30800 0.9417 26671584
0.0 14.6157 31000 0.9498 26845568
0.0 14.7100 31200 0.9565 27017952
0.0 14.8043 31400 0.9647 27191600
0.0 14.8986 31600 0.9717 27362144
0.0 14.9929 31800 0.9754 27536992
0.0 15.0872 32000 0.9888 27707728
0.0 15.1815 32200 0.9849 27886368
0.0 15.2758 32400 0.9882 28061984
0.0 15.3701 32600 0.9837 28233360
0.0 15.4644 32800 1.0049 28411200
0.0 15.5587 33000 1.0165 28582944
0.0 15.6530 33200 1.0125 28756240
0.0 15.7473 33400 1.0162 28926208
0.0 15.8416 33600 1.0269 29096816
0.0 15.9359 33800 1.0293 29267072
0.0 16.0302 34000 1.0407 29435360
0.0 16.1245 34200 1.0487 29610720
0.0 16.2188 34400 1.0502 29781472
0.0 16.3131 34600 1.0507 29959568
0.0 16.4074 34800 1.0549 30134704
0.0 16.5017 35000 1.0570 30305200
0.0 16.5959 35200 1.0530 30478576
0.0 16.6902 35400 1.0656 30647744
0.0 16.7845 35600 1.0645 30823072
0.0 16.8788 35800 1.0705 30996032
0.0 16.9731 36000 1.0821 31167328
0.0 17.0674 36200 1.0706 31341392
0.0 17.1617 36400 1.0773 31515648
0.0 17.2560 36600 1.0880 31690208
0.0 17.3503 36800 1.0854 31868288
0.0 17.4446 37000 1.0881 32041536
0.0 17.5389 37200 1.0959 32213584
0.0 17.6332 37400 1.0957 32385920
0.0 17.7275 37600 1.1022 32555856
0.0 17.8218 37800 1.0966 32729024
0.0 17.9161 38000 1.0943 32902832
0.0 18.0104 38200 1.1011 33076912
0.0 18.1047 38400 1.1012 33248832
0.0 18.1990 38600 1.1079 33420800
0.0 18.2933 38800 1.1023 33594000
0.0 18.3876 39000 1.1009 33765936
0.0 18.4818 39200 1.1061 33936896
0.0 18.5761 39400 1.1135 34110592
0.0 18.6704 39600 1.1081 34284208
0.0 18.7647 39800 1.1088 34458576
0.0 18.8590 40000 1.1102 34633072

Framework versions

  • PEFT 0.15.2.dev0
  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1