---
library_name: peft
license: gemma
base_model: google/gemma-3-1b-it
tags:
- llama-factory
- lntuning
- generated_from_trainer
datasets:
- super_glue
model-index:
- name: train_boolq_1745950274
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# train_boolq_1745950274

This model is a fine-tuned version of [google/gemma-3-1b-it](https://huggingface.co/google/gemma-3-1b-it) on the BoolQ task of the super_glue dataset.
It achieves the following results on the evaluation set:
- Loss: 2.9271 (matches the validation loss logged at step 9000; the final step, 40000, logged 2.9844)
- Num Input Tokens Seen: 34633072

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 123
- gradient_accumulation_steps: 2
- total_train_batch_size: 4
- optimizer: adamw_torch with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- training_steps: 40000
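
The relationship between these values can be sketched in a few lines of Python. This is a minimal illustration, not the Trainer's own code: it assumes a single device and no warmup (neither is stated in this card), and the helper names `effective_batch_size` and `cosine_lr` are hypothetical. It shows how `total_train_batch_size` follows from the per-device batch size and gradient accumulation, and what a plain cosine decay from the listed learning rate looks like over 40000 steps.

```python
import math

# Values copied from the hyperparameter list above.
LEARNING_RATE = 5e-05
TRAIN_BATCH_SIZE = 2
GRADIENT_ACCUMULATION_STEPS = 2
TRAINING_STEPS = 40000

# Assumptions, not stated in the card: one device and zero warmup steps.
NUM_DEVICES = 1
WARMUP_STEPS = 0


def effective_batch_size(per_device: int, accumulation: int, devices: int) -> int:
    """Number of examples contributing to one optimizer update."""
    return per_device * accumulation * devices


def cosine_lr(step: int, base_lr: float, total_steps: int, warmup_steps: int = 0) -> float:
    """Half-cosine decay from base_lr to 0, with optional linear warmup."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


total_batch = effective_batch_size(TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS, NUM_DEVICES)
print("total_train_batch_size:", total_batch)

for step in (0, 10000, 20000, 30000, 40000):
    lr = cosine_lr(step, LEARNING_RATE, TRAINING_STEPS, WARMUP_STEPS)
    print(f"step {step:>5}: lr = {lr:.3e}")
```

Under these assumptions the effective batch size is 2 × 2 × 1 = 4, which agrees with the `total_train_batch_size` reported above, and the learning rate reaches half its initial value at the midpoint of training.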

### Training results

| Training Loss | Epoch   | Step  | Validation Loss | Input Tokens Seen |
|:-------------:|:-------:|:-----:|:---------------:|:-----------------:|
| 2.6453        | 0.0943  | 200   | 3.2031          | 174096            |
| 3.5809        | 0.1886  | 400   | 3.1503          | 344560            |
| 3.5279        | 0.2829  | 600   | 3.1043          | 517536            |
| 2.8551        | 0.3772  | 800   | 3.0762          | 696016            |
| 2.7868        | 0.4715  | 1000  | 3.0509          | 868992            |
| 2.6766        | 0.5658  | 1200  | 3.0614          | 1040544           |
| 3.84          | 0.6601  | 1400  | 3.0431          | 1211680           |
| 2.3861        | 0.7544  | 1600  | 3.0519          | 1381792           |
| 3.4698        | 0.8487  | 1800  | 3.0226          | 1559456           |
| 2.9566        | 0.9430  | 2000  | 3.0275          | 1735840           |
| 2.8968        | 1.0372  | 2200  | 3.0316          | 1910848           |
| 3.1856        | 1.1315  | 2400  | 3.0049          | 2081696           |
| 3.4134        | 1.2258  | 2600  | 3.0010          | 2255952           |
| 2.9332        | 1.3201  | 2800  | 3.0122          | 2427152           |
| 2.9351        | 1.4144  | 3000  | 3.0129          | 2601296           |
| 2.5434        | 1.5087  | 3200  | 2.9970          | 2774672           |
| 3.1122        | 1.6030  | 3400  | 2.9784          | 2944896           |
| 2.7179        | 1.6973  | 3600  | 2.9600          | 3117216           |
| 2.8625        | 1.7916  | 3800  | 2.9709          | 3287952           |
| 2.7659        | 1.8859  | 4000  | 2.9735          | 3464640           |
| 2.9328        | 1.9802  | 4200  | 2.9566          | 3638880           |
| 3.0105        | 2.0745  | 4400  | 2.9613          | 3812624           |
| 3.4794        | 2.1688  | 4600  | 2.9646          | 3986544           |
| 2.7816        | 2.2631  | 4800  | 2.9783          | 4158272           |
| 3.1668        | 2.3574  | 5000  | 2.9668          | 4328240           |
| 3.0955        | 2.4517  | 5200  | 2.9730          | 4507760           |
| 2.8676        | 2.5460  | 5400  | 2.9506          | 4681664           |
| 3.3803        | 2.6403  | 5600  | 2.9642          | 4856928           |
| 3.1192        | 2.7346  | 5800  | 2.9736          | 5024976           |
| 3.3355        | 2.8289  | 6000  | 2.9883          | 5202368           |
| 3.3114        | 2.9231  | 6200  | 2.9882          | 5377360           |
| 2.1893        | 3.0174  | 6400  | 2.9834          | 5550480           |
| 3.533         | 3.1117  | 6600  | 2.9786          | 5724080           |
| 2.7837        | 3.2060  | 6800  | 2.9849          | 5896688           |
| 3.127         | 3.3003  | 7000  | 2.9557          | 6070544           |
| 2.7459        | 3.3946  | 7200  | 2.9538          | 6244624           |
| 3.2816        | 3.4889  | 7400  | 2.9563          | 6416176           |
| 3.3454        | 3.5832  | 7600  | 2.9709          | 6587616           |
| 2.7435        | 3.6775  | 7800  | 2.9540          | 6759696           |
| 3.1908        | 3.7718  | 8000  | 2.9497          | 6932384           |
| 2.6196        | 3.8661  | 8200  | 2.9479          | 7103328           |
| 3.1352        | 3.9604  | 8400  | 2.9728          | 7276304           |
| 2.9739        | 4.0547  | 8600  | 2.9485          | 7448112           |
| 2.5534        | 4.1490  | 8800  | 2.9555          | 7623632           |
| 2.5543        | 4.2433  | 9000  | 2.9271          | 7799248           |
| 2.5284        | 4.3376  | 9200  | 2.9455          | 7974368           |
| 2.8538        | 4.4319  | 9400  | 2.9379          | 8146384           |
| 2.7943        | 4.5262  | 9600  | 2.9336          | 8321456           |
| 2.4081        | 4.6205  | 9800  | 2.9336          | 8490096           |
| 2.7996        | 4.7148  | 10000 | 2.9381          | 8665904           |
| 2.3661        | 4.8091  | 10200 | 2.9455          | 8837712           |
| 2.966         | 4.9033  | 10400 | 2.9509          | 9010400           |
| 3.1711        | 4.9976  | 10600 | 2.9699          | 9185584           |
| 3.311         | 5.0919  | 10800 | 2.9491          | 9358160           |
| 2.4932        | 5.1862  | 11000 | 2.9626          | 9535520           |
| 3.3139        | 5.2805  | 11200 | 2.9806          | 9709232           |
| 2.7371        | 5.3748  | 11400 | 2.9771          | 9880896           |
| 2.9514        | 5.4691  | 11600 | 2.9830          | 10053056          |
| 3.3724        | 5.5634  | 11800 | 2.9859          | 10229152          |
| 3.1756        | 5.6577  | 12000 | 2.9848          | 10404384          |
| 2.9306        | 5.7520  | 12200 | 2.9874          | 10573872          |
| 2.9906        | 5.8463  | 12400 | 2.9814          | 10748304          |
| 3.0223        | 5.9406  | 12600 | 2.9898          | 10917920          |
| 3.4233        | 6.0349  | 12800 | 2.9858          | 11092736          |
| 2.554         | 6.1292  | 13000 | 2.9839          | 11269264          |
| 3.5816        | 6.2235  | 13200 | 2.9827          | 11441120          |
| 2.9904        | 6.3178  | 13400 | 2.9840          | 11614176          |
| 2.3922        | 6.4121  | 13600 | 2.9881          | 11785424          |
| 3.0193        | 6.5064  | 13800 | 2.9901          | 11960752          |
| 2.663         | 6.6007  | 14000 | 2.9775          | 12132672          |
| 3.3592        | 6.6950  | 14200 | 2.9821          | 12303424          |
| 3.1617        | 6.7893  | 14400 | 2.9830          | 12474592          |
| 3.1247        | 6.8835  | 14600 | 2.9735          | 12649424          |
| 2.4094        | 6.9778  | 14800 | 2.9854          | 12821280          |
| 2.8975        | 7.0721  | 15000 | 2.9798          | 12996208          |
| 3.3305        | 7.1664  | 15200 | 2.9779          | 13172592          |
| 2.8335        | 7.2607  | 15400 | 2.9754          | 13342864          |
| 3.2162        | 7.3550  | 15600 | 2.9741          | 13515600          |
| 3.1557        | 7.4493  | 15800 | 2.9858          | 13688640          |
| 3.123         | 7.5436  | 16000 | 2.9802          | 13863312          |
| 3.1461        | 7.6379  | 16200 | 2.9744          | 14032992          |
| 2.5753        | 7.7322  | 16400 | 2.9744          | 14205936          |
| 3.0835        | 7.8265  | 16600 | 2.9788          | 14378336          |
| 2.9754        | 7.9208  | 16800 | 2.9861          | 14551456          |
| 2.7244        | 8.0151  | 17000 | 2.9781          | 14730672          |
| 3.4109        | 8.1094  | 17200 | 2.9844          | 14904544          |
| 2.7873        | 8.2037  | 17400 | 2.9804          | 15078832          |
| 3.263         | 8.2980  | 17600 | 2.9827          | 15254544          |
| 2.6633        | 8.3923  | 17800 | 2.9717          | 15422256          |
| 2.6194        | 8.4866  | 18000 | 2.9880          | 15595776          |
| 2.8025        | 8.5809  | 18200 | 2.9845          | 15768288          |
| 3.2739        | 8.6752  | 18400 | 2.9862          | 15941776          |
| 3.0337        | 8.7694  | 18600 | 2.9897          | 16115152          |
| 3.0608        | 8.8637  | 18800 | 2.9865          | 16284384          |
| 3.5312        | 8.9580  | 19000 | 2.9885          | 16457552          |
| 2.6771        | 9.0523  | 19200 | 2.9896          | 16632272          |
| 2.8448        | 9.1466  | 19400 | 2.9859          | 16806304          |
| 3.4979        | 9.2409  | 19600 | 2.9860          | 16979072          |
| 3.4671        | 9.3352  | 19800 | 2.9841          | 17150160          |
| 3.6682        | 9.4295  | 20000 | 2.9850          | 17321280          |
| 2.8798        | 9.5238  | 20200 | 2.9839          | 17495488          |
| 3.5262        | 9.6181  | 20400 | 2.9818          | 17670576          |
| 3.3741        | 9.7124  | 20600 | 2.9857          | 17843440          |
| 2.8687        | 9.8067  | 20800 | 2.9815          | 18012496          |
| 2.7849        | 9.9010  | 21000 | 2.9824          | 18186480          |
| 2.7368        | 9.9953  | 21200 | 2.9828          | 18360368          |
| 2.6571        | 10.0896 | 21400 | 2.9818          | 18539664          |
| 2.6093        | 10.1839 | 21600 | 2.9837          | 18718016          |
| 2.8979        | 10.2782 | 21800 | 2.9838          | 18888560          |
| 2.3822        | 10.3725 | 22000 | 2.9835          | 19061328          |
| 2.8941        | 10.4668 | 22200 | 2.9847          | 19236176          |
| 2.2785        | 10.5611 | 22400 | 2.9793          | 19404288          |
| 2.7086        | 10.6554 | 22600 | 2.9829          | 19574224          |
| 3.0499        | 10.7496 | 22800 | 2.9829          | 19744496          |
| 2.5357        | 10.8439 | 23000 | 2.9834          | 19915984          |
| 2.8058        | 10.9382 | 23200 | 2.9827          | 20090944          |
| 3.2345        | 11.0325 | 23400 | 2.9843          | 20264992          |
| 2.6316        | 11.1268 | 23600 | 2.9810          | 20437952          |
| 3.344         | 11.2211 | 23800 | 2.9823          | 20611040          |
| 3.0959        | 11.3154 | 24000 | 2.9831          | 20787488          |
| 3.3262        | 11.4097 | 24200 | 2.9829          | 20958240          |
| 3.9468        | 11.5040 | 24400 | 2.9828          | 21133392          |
| 2.874         | 11.5983 | 24600 | 2.9810          | 21303360          |
| 2.9608        | 11.6926 | 24800 | 2.9846          | 21475184          |
| 2.9467        | 11.7869 | 25000 | 2.9840          | 21649744          |
| 2.8529        | 11.8812 | 25200 | 2.9841          | 21819728          |
| 3.0579        | 11.9755 | 25400 | 2.9836          | 21993120          |
| 2.9273        | 12.0698 | 25600 | 2.9827          | 22164624          |
| 3.3136        | 12.1641 | 25800 | 2.9837          | 22340064          |
| 2.507         | 12.2584 | 26000 | 2.9838          | 22515088          |
| 2.7376        | 12.3527 | 26200 | 2.9842          | 22692240          |
| 2.3293        | 12.4470 | 26400 | 2.9816          | 22864512          |
| 3.2821        | 12.5413 | 26600 | 2.9816          | 23037568          |
| 2.8383        | 12.6355 | 26800 | 2.9818          | 23207936          |
| 2.491         | 12.7298 | 27000 | 2.9850          | 23381376          |
| 2.7425        | 12.8241 | 27200 | 2.9821          | 23553008          |
| 3.0866        | 12.9184 | 27400 | 2.9818          | 23722608          |
| 3.0738        | 13.0127 | 27600 | 2.9836          | 23892928          |
| 2.6363        | 13.1070 | 27800 | 2.9819          | 24063632          |
| 3.15          | 13.2013 | 28000 | 2.9816          | 24237248          |
| 2.9501        | 13.2956 | 28200 | 2.9806          | 24411712          |
| 3.1561        | 13.3899 | 28400 | 2.9818          | 24584800          |
| 3.0268        | 13.4842 | 28600 | 2.9812          | 24759888          |
| 2.6915        | 13.5785 | 28800 | 2.9827          | 24936720          |
| 2.1768        | 13.6728 | 29000 | 2.9816          | 25110864          |
| 2.769         | 13.7671 | 29200 | 2.9800          | 25284944          |
| 3.4771        | 13.8614 | 29400 | 2.9812          | 25456816          |
| 3.0152        | 13.9557 | 29600 | 2.9807          | 25631728          |
| 2.8072        | 14.0500 | 29800 | 2.9815          | 25801056          |
| 3.4366        | 14.1443 | 30000 | 2.9832          | 25978896          |
| 3.0998        | 14.2386 | 30200 | 2.9835          | 26156672          |
| 2.6254        | 14.3329 | 30400 | 2.9815          | 26330592          |
| 3.1786        | 14.4272 | 30600 | 2.9817          | 26502800          |
| 2.803         | 14.5215 | 30800 | 2.9820          | 26671584          |
| 2.881         | 14.6157 | 31000 | 2.9818          | 26845568          |
| 3.4202        | 14.7100 | 31200 | 2.9797          | 27017952          |
| 2.9903        | 14.8043 | 31400 | 2.9816          | 27191600          |
| 2.9173        | 14.8986 | 31600 | 2.9821          | 27362144          |
| 2.5902        | 14.9929 | 31800 | 2.9826          | 27536992          |
| 3.4131        | 15.0872 | 32000 | 2.9853          | 27707728          |
| 3.1571        | 15.1815 | 32200 | 2.9846          | 27886368          |
| 4.0238        | 15.2758 | 32400 | 2.9850          | 28061984          |
| 2.7433        | 15.3701 | 32600 | 2.9855          | 28233360          |
| 2.9622        | 15.4644 | 32800 | 2.9850          | 28411200          |
| 2.9855        | 15.5587 | 33000 | 2.9855          | 28582944          |
| 2.8146        | 15.6530 | 33200 | 2.9850          | 28756240          |
| 2.1957        | 15.7473 | 33400 | 2.9850          | 28926208          |
| 3.0656        | 15.8416 | 33600 | 2.9850          | 29096816          |
| 2.781         | 15.9359 | 33800 | 2.9844          | 29267072          |
| 2.8638        | 16.0302 | 34000 | 2.9844          | 29435360          |
| 3.3154        | 16.1245 | 34200 | 2.9844          | 29610720          |
| 2.351         | 16.2188 | 34400 | 2.9844          | 29781472          |
| 2.8905        | 16.3131 | 34600 | 2.9844          | 29959568          |
| 2.4058        | 16.4074 | 34800 | 2.9844          | 30134704          |
| 3.0833        | 16.5017 | 35000 | 2.9844          | 30305200          |
| 3.0284        | 16.5959 | 35200 | 2.9844          | 30478576          |
| 2.7625        | 16.6902 | 35400 | 2.9844          | 30647744          |
| 2.5376        | 16.7845 | 35600 | 2.9844          | 30823072          |
| 3.3257        | 16.8788 | 35800 | 2.9844          | 30996032          |
| 2.8614        | 16.9731 | 36000 | 2.9844          | 31167328          |
| 2.8231        | 17.0674 | 36200 | 2.9844          | 31341392          |
| 2.4818        | 17.1617 | 36400 | 2.9844          | 31515648          |
| 3.5857        | 17.2560 | 36600 | 2.9844          | 31690208          |
| 2.8301        | 17.3503 | 36800 | 2.9844          | 31868288          |
| 2.6124        | 17.4446 | 37000 | 2.9844          | 32041536          |
| 3.5769        | 17.5389 | 37200 | 2.9844          | 32213584          |
| 3.0379        | 17.6332 | 37400 | 2.9844          | 32385920          |
| 2.76          | 17.7275 | 37600 | 2.9844          | 32555856          |
| 3.1422        | 17.8218 | 37800 | 2.9844          | 32729024          |
| 2.7946        | 17.9161 | 38000 | 2.9844          | 32902832          |
| 3.2047        | 18.0104 | 38200 | 2.9844          | 33076912          |
| 2.4714        | 18.1047 | 38400 | 2.9844          | 33248832          |
| 2.6947        | 18.1990 | 38600 | 2.9844          | 33420800          |
| 3.2644        | 18.2933 | 38800 | 2.9844          | 33594000          |
| 2.9905        | 18.3876 | 39000 | 2.9844          | 33765936          |
| 2.7909        | 18.4818 | 39200 | 2.9844          | 33936896          |
| 2.7959        | 18.5761 | 39400 | 2.9844          | 34110592          |
| 2.8925        | 18.6704 | 39600 | 2.9844          | 34284208          |
| 3.0191        | 18.7647 | 39800 | 2.9844          | 34458576          |
| 3.5334        | 18.8590 | 40000 | 2.9844          | 34633072          |
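
The table can be summarized programmatically. The sketch below is illustrative only: it parses three rows copied verbatim from the table (the first, the lowest validation loss, and the last) and derives the best checkpoint step, the approximate number of optimizer steps per epoch, and the average number of input tokens per step. The names `TABLE_EXCERPT` and `parse_rows` are hypothetical helpers, not part of any library; to analyze the full log, paste all rows into the string.

```python
# Three rows copied from the training results table above.
TABLE_EXCERPT = """
| 2.6453        | 0.0943  | 200   | 3.2031          | 174096            |
| 2.5543        | 4.2433  | 9000  | 2.9271          | 7799248           |
| 3.5334        | 18.8590 | 40000 | 2.9844          | 34633072          |
"""


def parse_rows(table: str) -> list[dict]:
    """Turn Markdown table rows into dictionaries with numeric fields."""
    rows = []
    for line in table.strip().splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        train_loss, epoch, step, val_loss, tokens = cells
        rows.append(
            {
                "train_loss": float(train_loss),
                "epoch": float(epoch),
                "step": int(step),
                "val_loss": float(val_loss),
                "tokens": int(tokens),
            }
        )
    return rows


rows = parse_rows(TABLE_EXCERPT)

# Row with the lowest validation loss among the parsed rows.
best = min(rows, key=lambda row: row["val_loss"])

# Derived quantities based on the final logged row.
last = rows[-1]
steps_per_epoch = last["step"] / last["epoch"]
tokens_per_step = last["tokens"] / last["step"]

print(f"best validation loss {best['val_loss']} at step {best['step']}")
print(f"approx. optimizer steps per epoch: {steps_per_epoch:.0f}")
print(f"average input tokens per step: {tokens_per_step:.1f}")
```

With these rows, the lowest validation loss falls at step 9000, and the step-to-epoch ratio works out to roughly 2121 optimizer steps per epoch.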


### Framework versions

- PEFT 0.15.2.dev0
- Transformers 4.51.3
- Pytorch 2.6.0+cu124
- Datasets 3.5.0
- Tokenizers 0.21.1