colloquial-en-ta-model

This model is a fine-tuned version of unsloth/tinyllama-chat-bnb-4bit on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 4.6425
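
Although no usage example is documented, the following is a minimal loading sketch. It assumes the adapter resolves through PEFT's AutoPeftModelForCausalLM, that the tokenizer is taken from the base repository, and that a Tamil chat prompt matches the intended task (the prompt itself is hypothetical). Since the base checkpoint is 4-bit quantized, bitsandbytes and accelerate must be installed.

```python
import torch
from transformers import AutoTokenizer
from peft import AutoPeftModelForCausalLM

ADAPTER_ID = "Renuga07/colloquial-en-ta-model"
BASE_ID = "unsloth/tinyllama-chat-bnb-4bit"  # base model named above

# Load the 4-bit base model and apply this PEFT adapter on top.
model = AutoPeftModelForCausalLM.from_pretrained(ADAPTER_ID, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(BASE_ID)

# Hypothetical prompt; the real prompt format depends on the
# (undocumented) training data.
messages = [{"role": "user", "content": "Say 'see you tomorrow' in colloquial Tamil."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    out = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```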

Model description

More information needed. What can be inferred: the repository name suggests a focus on colloquial English–Tamil (en–ta) text, and the framework versions below indicate this is a PEFT adapter on top of the TinyLlama chat base rather than a full fine-tune. The training task and data are otherwise undocumented.

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: AdamW (PyTorch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 10
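
For reference, the list above maps onto a transformers TrainingArguments object roughly as sketched below. The output directory and evaluation cadence are assumptions (eval_steps=2 is inferred from the step column in the results table), and the PEFT/adapter configuration is omitted because it is not documented.

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the listed hyperparameters.
training_args = TrainingArguments(
    output_dir="colloquial-en-ta-model",  # assumed
    learning_rate=3e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    optim="adamw_torch",                  # AdamW, PyTorch implementation
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=10,
    eval_strategy="steps",                # the table shows an eval every 2 steps
    eval_steps=2,
)
```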

Training results

Validation loss bottoms out at roughly 3.50 around epoch 1.1 (step 26) and trends upward for the rest of training while training loss stays roughly flat, which suggests early overfitting; an earlier checkpoint would likely generalize better than the final one (see the early-stopping sketch after the table).

Training Loss Epoch Step Validation Loss
0.186 0.0833 2 3.8667
0.2332 0.1667 4 3.9530
0.1958 0.25 6 3.9433
0.2574 0.3333 8 3.9266
0.2392 0.4167 10 4.0018
0.267 0.5 12 4.1073
0.2365 0.5833 14 4.1133
0.2297 0.6667 16 4.1735
0.2939 0.75 18 4.1250
0.2839 0.8333 20 3.9915
0.2321 0.9167 22 3.8862
0.2593 1.0 24 3.5976
0.2443 1.0833 26 3.5034
0.2446 1.1667 28 3.6098
0.2025 1.25 30 3.6458
0.2096 1.3333 32 3.8463
0.1989 1.4167 34 3.9578
0.2417 1.5 36 4.0063
0.1932 1.5833 38 3.9616
0.2697 1.6667 40 3.9236
0.3232 1.75 42 4.0916
0.2395 1.8333 44 4.4588
0.2635 1.9167 46 4.6681
0.2209 2.0 48 4.8236
0.2299 2.0833 50 4.8193
0.2216 2.1667 52 4.7359
0.246 2.25 54 4.6201
0.2061 2.3333 56 4.4043
0.2607 2.4167 58 4.3860
0.2999 2.5 60 4.4531
0.1862 2.5833 62 4.4239
0.2172 2.6667 64 4.2959
0.2011 2.75 66 4.1646
0.3363 2.8333 68 4.0998
0.1977 2.9167 70 4.0342
0.2381 3.0 72 4.0860
0.2532 3.0833 74 4.1734
0.1982 3.1667 76 4.1134
0.232 3.25 78 4.0263
0.2639 3.3333 80 4.0635
0.2489 3.4167 82 4.1314
0.2761 3.5 84 4.1371
0.2145 3.5833 86 4.0581
0.2398 3.6667 88 3.9594
0.2952 3.75 90 3.9724
0.1801 3.8333 92 3.9485
0.2985 3.9167 94 3.9065
0.2383 4.0 96 3.9725
0.2271 4.0833 98 3.9096
0.2528 4.1667 100 3.9251
0.2173 4.25 102 3.9609
0.2495 4.3333 104 3.9952
0.2278 4.4167 106 4.0805
0.2137 4.5 108 4.1261
0.2475 4.5833 110 4.1534
0.2416 4.6667 112 4.2162
0.2242 4.75 114 4.2275
0.22 4.8333 116 4.1912
0.2289 4.9167 118 4.1684
0.3094 5.0 120 4.2682
0.2152 5.0833 122 4.2967
0.262 5.1667 124 4.3709
0.2207 5.25 126 4.4091
0.1992 5.3333 128 4.4117
0.1872 5.4167 130 4.4086
0.2242 5.5 132 4.3438
0.232 5.5833 134 4.3030
0.2397 5.6667 136 4.3257
0.2319 5.75 138 4.3530
0.2234 5.8333 140 4.3907
0.2388 5.9167 142 4.4481
0.2488 6.0 144 4.4957
0.207 6.0833 146 4.4889
0.2375 6.1667 148 4.4046
0.2502 6.25 150 4.3721
0.2074 6.3333 152 4.2997
0.1813 6.4167 154 4.2063
0.1741 6.5 156 4.1596
0.2411 6.5833 158 4.1992
0.2215 6.6667 160 4.2852
0.2043 6.75 162 4.3838
0.2214 6.8333 164 4.5184
0.2556 6.9167 166 4.6443
0.2216 7.0 168 4.7537
0.2029 7.0833 170 4.7834
0.1975 7.1667 172 4.7177
0.2277 7.25 174 4.6574
0.2244 7.3333 176 4.5969
0.1995 7.4167 178 4.5469
0.206 7.5 180 4.5155
0.2458 7.5833 182 4.4985
0.2347 7.6667 184 4.4947
0.2377 7.75 186 4.5014
0.2178 7.8333 188 4.5097
0.2361 7.9167 190 4.4964
0.2231 8.0 192 4.5026
0.1831 8.0833 194 4.5176
0.1926 8.1667 196 4.5404
0.2053 8.25 198 4.5320
0.219 8.3333 200 4.5144
0.1942 8.4167 202 4.5047
0.2137 8.5 204 4.5080
0.2544 8.5833 206 4.5164
0.2233 8.6667 208 4.5326
0.1995 8.75 210 4.5626
0.2327 8.8333 212 4.5890
0.2567 8.9167 214 4.6160
0.2331 9.0 216 4.6435
0.2036 9.0833 218 4.6524
0.2416 9.1667 220 4.6509
0.2579 9.25 222 4.6513
0.2302 9.3333 224 4.6485
0.2289 9.4167 226 4.6461
0.2008 9.5 228 4.6489
0.2198 9.5833 230 4.6516
0.2416 9.6667 232 4.6467
0.1906 9.75 234 4.6435
0.2075 9.8333 236 4.6405
0.2106 9.9167 238 4.6406
0.1772 10.0 240 4.6425
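
Given that curve, a re-run that keeps the best checkpoint rather than the last would likely score better. One illustrative sketch using the Trainer's built-in early stopping (all names and values here are assumptions, not the original training setup):

```python
from transformers import TrainingArguments, EarlyStoppingCallback

# Illustrative settings: checkpoint selection tracks eval_loss, and
# training stops once it stops improving.
training_args = TrainingArguments(
    output_dir="colloquial-en-ta-model-best",  # hypothetical
    learning_rate=3e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    num_train_epochs=10,
    eval_strategy="steps",
    eval_steps=2,
    save_strategy="steps",
    save_steps=2,  # must stay aligned with eval_steps for best-model tracking
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)
# Pass to Trainer(..., callbacks=[stopper]): training halts after 5
# consecutive evaluations without an improvement in eval_loss.
stopper = EarlyStoppingCallback(early_stopping_patience=5)
```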

Framework versions

  • PEFT 0.14.0
  • Transformers 4.48.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.3.1
  • Tokenizers 0.21.0