tamil-colloquial-english-translate-model

This model is a fine-tuned version of unsloth/tinyllama-chat-bnb-4bit on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 3.2479
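
Since this repository ships a PEFT adapter for the quantized TinyLlama base, it should be loadable with `peft` on top of `unsloth/tinyllama-chat-bnb-4bit`. The snippet below is only an illustrative sketch: the repository id, prompt wording, and generation settings are assumptions and are not documented by this card.

```python
# Hypothetical usage sketch (not from the original card).
# Requires bitsandbytes and a CUDA GPU, since the base model is 4-bit quantized.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "unsloth/tinyllama-chat-bnb-4bit"
adapter_id = "Sruthi-sai-2004/tamil-colloquial-english-translate-model"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(base, adapter_id)  # attach the LoRA adapter

# Assumed prompt format; the card does not specify one.
prompt = "Translate this colloquial Tamil sentence to English: <your sentence here>"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```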

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a corresponding TrainingArguments sketch follows this list):

  • learning_rate: 0.0003
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 8
  • optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 2
  • num_epochs: 5
  • mixed_precision_training: Native AMP
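
For reference, a minimal sketch of a `TrainingArguments` object that mirrors the hyperparameters above. The actual training script, dataset, and LoRA configuration are not documented here, and `fp16=True` is an assumption for the "Native AMP" setting.

```python
# Illustrative sketch only; not the author's training script.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="tamil-colloquial-english-translate-model",
    learning_rate=3e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=2,   # effective train batch size 4 * 2 = 8
    num_train_epochs=5,
    lr_scheduler_type="linear",
    warmup_steps=2,
    optim="adamw_torch",             # AdamW with betas=(0.9, 0.999), eps=1e-8
    fp16=True,                       # assumed mixed-precision ("Native AMP") mode
    seed=42,
)
```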

Training results

Training Loss Epoch Step Validation Loss
8.6422 0.0265 2 6.0298
6.9944 0.0530 4 5.1665
6.7864 0.0795 6 5.1665
3.7423 0.1060 8 3.3416
2.7338 0.1325 10 2.7392
2.9393 0.1589 12 3.0218
2.2037 0.1854 14 2.7683
2.0967 0.2119 16 3.0727
1.9914 0.2384 18 3.4467
2.0141 0.2649 20 5.4803
2.4236 0.2914 22 7.4676
3.463 0.3179 24 7.5226
2.1304 0.3444 26 7.1676
2.4984 0.3709 28 6.9342
1.6678 0.3974 30 6.9310
2.3985 0.4238 32 6.4441
1.794 0.4503 34 5.7529
2.4018 0.4768 36 5.2054
2.4566 0.5033 38 4.9671
2.2224 0.5298 40 4.5316
1.94 0.5563 42 3.9569
1.9299 0.5828 44 3.7401
1.719 0.6093 46 3.6166
1.7334 0.6358 48 3.0501
1.6715 0.6623 50 2.5901
1.3979 0.6887 52 2.6318
1.9087 0.7152 54 2.9039
1.8333 0.7417 56 2.8776
1.6035 0.7682 58 2.7774
1.41 0.7947 60 2.5044
1.9075 0.8212 62 2.8705
1.608 0.8477 64 2.7610
1.7068 0.8742 66 2.9243
1.6267 0.9007 68 2.3522
1.4378 0.9272 70 2.6712
1.8967 0.9536 72 3.7529
1.4106 0.9801 74 2.9436
2.0129 1.0 76 3.1104
1.2537 1.0265 78 2.7373
1.5516 1.0530 80 2.4722
1.4263 1.0795 82 2.1472
0.9644 1.1060 84 2.3321
1.679 1.1325 86 2.9275
1.2739 1.1589 88 2.1865
1.4228 1.1854 90 2.0449
1.3859 1.2119 92 2.6632
1.6259 1.2384 94 2.6249
1.5091 1.2649 96 2.1462
1.6238 1.2914 98 2.8612
1.4244 1.3179 100 2.8897
1.6451 1.3444 102 2.2974
1.5182 1.3709 104 3.0305
1.1502 1.3974 106 2.7777
1.3721 1.4238 108 2.0768
1.8245 1.4503 110 2.5860
1.222 1.4768 112 2.6979
1.6133 1.5033 114 2.3744
1.2356 1.5298 116 2.5420
1.4606 1.5563 118 2.2398
1.3163 1.5828 120 2.4619
1.3804 1.6093 122 2.2209
1.4569 1.6358 124 2.8278
1.1365 1.6623 126 2.5291
1.5134 1.6887 128 2.5234
1.2794 1.7152 130 2.7923
1.1179 1.7417 132 2.2813
1.5328 1.7682 134 2.4505
1.4426 1.7947 136 3.1080
1.494 1.8212 138 2.6538
1.3861 1.8477 140 2.5577
1.3619 1.8742 142 2.7934
1.1387 1.9007 144 2.3147
1.1863 1.9272 146 2.3039
1.21 1.9536 148 2.6430
1.3249 1.9801 150 2.8306
2.0297 2.0 152 4.0284
1.3257 2.0265 154 4.3653
1.6556 2.0530 156 3.1694
1.2546 2.0795 158 2.4593
1.2407 2.1060 160 2.7860
1.2353 2.1325 162 2.6791
1.2571 2.1589 164 2.2779
1.5464 2.1854 166 2.6509
1.307 2.2119 168 3.1498
1.3582 2.2384 170 2.6785
1.1259 2.2649 172 2.3530
1.133 2.2914 174 2.5743
1.0692 2.3179 176 2.6552
1.2508 2.3444 178 2.4048
1.504 2.3709 180 2.7153
1.5213 2.3974 182 2.8302
1.4263 2.4238 184 2.7825
1.2581 2.4503 186 2.7872
1.3904 2.4768 188 2.9785
1.2969 2.5033 190 3.0633
1.4557 2.5298 192 3.3497
1.2728 2.5563 194 2.9371
1.154 2.5828 196 2.3541
1.5159 2.6093 198 2.5202
1.1535 2.6358 200 3.1148
1.2246 2.6623 202 3.1917
1.5538 2.6887 204 2.9176
1.3437 2.7152 206 2.9513
1.384 2.7417 208 3.0657
1.1712 2.7682 210 3.0711
1.1171 2.7947 212 2.6282
1.1222 2.8212 214 2.7070
1.0502 2.8477 216 3.1335
1.5044 2.8742 218 3.3894
1.1937 2.9007 220 2.9295
1.4499 2.9272 222 2.4740
1.1369 2.9536 224 2.6045
0.9361 2.9801 226 2.8151
0.9825 3.0 228 2.5062
1.1738 3.0265 230 2.1268
1.7623 3.0530 232 2.1633
1.2964 3.0795 234 2.5530
1.2397 3.1060 236 2.5847
1.1588 3.1325 238 2.3435
1.2689 3.1589 240 2.5115
1.2141 3.1854 242 2.6775
1.4553 3.2119 244 2.8488
1.1172 3.2384 246 2.7492
1.342 3.2649 248 2.9571
1.5291 3.2914 250 3.3224
1.3595 3.3179 252 3.7868
1.703 3.3444 254 3.9059
1.3682 3.3709 256 3.7033
1.4801 3.3974 258 3.1122
1.2444 3.4238 260 2.6195
1.0801 3.4503 262 2.4804
1.1966 3.4768 264 2.7418
1.1321 3.5033 266 2.9589
1.4521 3.5298 268 3.2319
1.5539 3.5563 270 3.7307
1.2244 3.5828 272 4.0520
1.1516 3.6093 274 3.7584
1.2933 3.6358 276 3.5864
1.3404 3.6623 278 3.7727
1.1585 3.6887 280 3.6076
1.0974 3.7152 282 3.2680
1.2817 3.7417 284 3.4060
1.2721 3.7682 286 3.7119
1.3948 3.7947 288 3.7408
1.14 3.8212 290 3.5425
1.5698 3.8477 292 3.2189
1.1783 3.8742 294 2.9808
0.9986 3.9007 296 2.8487
1.3001 3.9272 298 2.9359
1.0658 3.9536 300 2.9475
1.2286 3.9801 302 2.9914
1.7819 4.0 304 3.2627
1.2959 4.0265 306 3.5096
1.2636 4.0530 308 3.4164
1.6574 4.0795 310 3.2132
1.116 4.1060 312 3.0850
1.272 4.1325 314 3.1118
1.1456 4.1589 316 3.1339
1.4732 4.1854 318 3.2678
1.2832 4.2119 320 3.3656
1.1853 4.2384 322 3.3827
1.0865 4.2649 324 3.2203
1.1281 4.2914 326 3.0227
1.3469 4.3179 328 2.8155
1.3308 4.3444 330 2.7892
1.0484 4.3709 332 2.9204
1.279 4.3974 334 3.1109
1.165 4.4238 336 3.2519
1.2145 4.4503 338 3.2841
1.4212 4.4768 340 3.3603
1.3132 4.5033 342 3.3661
1.0477 4.5298 344 3.3598
1.6463 4.5563 346 3.4348
1.1416 4.5828 348 3.4522
1.2699 4.6093 350 3.4344
1.3752 4.6358 352 3.4206
1.4558 4.6623 354 3.4065
1.4562 4.6887 356 3.3909
0.9682 4.7152 358 3.3530
1.3652 4.7417 360 3.3158
1.2207 4.7682 362 3.2640
1.3417 4.7947 364 3.2225
0.948 4.8212 366 3.1754
1.2974 4.8477 368 3.1646
1.6318 4.8742 370 3.1947
1.5222 4.9007 372 3.2226
1.486 4.9272 374 3.2479

Framework versions

  • PEFT 0.14.0
  • Transformers 4.48.3
  • PyTorch 2.6.0+cu124
  • Datasets 3.3.2
  • Tokenizers 0.21.0
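
To reproduce results, it may help to confirm the local environment roughly matches these versions. A small sketch (package import names are the standard ones; this check is not part of the original card):

```python
# Sketch: print installed versions to compare against the list above.
import peft, transformers, torch, datasets, tokenizers

for name, module in [("PEFT", peft), ("Transformers", transformers),
                     ("PyTorch", torch), ("Datasets", datasets),
                     ("Tokenizers", tokenizers)]:
    print(f"{name}: {module.__version__}")
```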