# tamil-colloquial-english-translate-model

This model is a fine-tuned version of [unsloth/tinyllama-chat-bnb-4bit](https://huggingface.co/unsloth/tinyllama-chat-bnb-4bit) on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 3.2479
## Model description
More information needed
## Intended uses & limitations
More information needed
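Pending a fuller description, the sketch below shows one plausible way to load and query the model. It assumes this repository holds a PEFT adapter for the 4-bit base model (consistent with the PEFT version listed under "Framework versions"), that `bitsandbytes` is installed for the quantized weights, and that the base tokenizer's chat template applies; the prompt itself is hypothetical, since the training data and prompt format are not documented.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "unsloth/tinyllama-chat-bnb-4bit"
adapter_id = "Sruthi-sai-2004/tamil-colloquial-english-translate-model"

# Load the 4-bit base model (requires bitsandbytes) and its tokenizer.
tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")

# Attach the fine-tuned adapter weights from this repository.
model = PeftModel.from_pretrained(base_model, adapter_id)

# Hypothetical prompt; the actual format used during fine-tuning is unknown.
messages = [{"role": "user", "content": "Translate to English: enna da panra?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=64)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```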
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 0.0003
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 8
- optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 2
- num_epochs: 5
- mixed_precision_training: Native AMP
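For reference, these settings map onto a `transformers` `TrainingArguments` object roughly as below. This is a sketch, not the author's actual training script; `output_dir` is assumed, and anything not listed above is left at its default.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="tamil-colloquial-english-translate-model",  # assumed name
    learning_rate=3e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=2,  # effective train batch size: 4 * 2 = 8
    num_train_epochs=5,
    lr_scheduler_type="linear",
    warmup_steps=2,
    optim="adamw_torch",            # AdamW with default betas=(0.9, 0.999), eps=1e-8
    seed=42,
    fp16=True,                      # native AMP; assumes fp16 rather than bf16 hardware
)
```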
### Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
8.6422 | 0.0265 | 2 | 6.0298 |
6.9944 | 0.0530 | 4 | 5.1665 |
6.7864 | 0.0795 | 6 | 5.1665 |
3.7423 | 0.1060 | 8 | 3.3416 |
2.7338 | 0.1325 | 10 | 2.7392 |
2.9393 | 0.1589 | 12 | 3.0218 |
2.2037 | 0.1854 | 14 | 2.7683 |
2.0967 | 0.2119 | 16 | 3.0727 |
1.9914 | 0.2384 | 18 | 3.4467 |
2.0141 | 0.2649 | 20 | 5.4803 |
2.4236 | 0.2914 | 22 | 7.4676 |
3.463 | 0.3179 | 24 | 7.5226 |
2.1304 | 0.3444 | 26 | 7.1676 |
2.4984 | 0.3709 | 28 | 6.9342 |
1.6678 | 0.3974 | 30 | 6.9310 |
2.3985 | 0.4238 | 32 | 6.4441 |
1.794 | 0.4503 | 34 | 5.7529 |
2.4018 | 0.4768 | 36 | 5.2054 |
2.4566 | 0.5033 | 38 | 4.9671 |
2.2224 | 0.5298 | 40 | 4.5316 |
1.94 | 0.5563 | 42 | 3.9569 |
1.9299 | 0.5828 | 44 | 3.7401 |
1.719 | 0.6093 | 46 | 3.6166 |
1.7334 | 0.6358 | 48 | 3.0501 |
1.6715 | 0.6623 | 50 | 2.5901 |
1.3979 | 0.6887 | 52 | 2.6318 |
1.9087 | 0.7152 | 54 | 2.9039 |
1.8333 | 0.7417 | 56 | 2.8776 |
1.6035 | 0.7682 | 58 | 2.7774 |
1.41 | 0.7947 | 60 | 2.5044 |
1.9075 | 0.8212 | 62 | 2.8705 |
1.608 | 0.8477 | 64 | 2.7610 |
1.7068 | 0.8742 | 66 | 2.9243 |
1.6267 | 0.9007 | 68 | 2.3522 |
1.4378 | 0.9272 | 70 | 2.6712 |
1.8967 | 0.9536 | 72 | 3.7529 |
1.4106 | 0.9801 | 74 | 2.9436 |
2.0129 | 1.0 | 76 | 3.1104 |
1.2537 | 1.0265 | 78 | 2.7373 |
1.5516 | 1.0530 | 80 | 2.4722 |
1.4263 | 1.0795 | 82 | 2.1472 |
0.9644 | 1.1060 | 84 | 2.3321 |
1.679 | 1.1325 | 86 | 2.9275 |
1.2739 | 1.1589 | 88 | 2.1865 |
1.4228 | 1.1854 | 90 | 2.0449 |
1.3859 | 1.2119 | 92 | 2.6632 |
1.6259 | 1.2384 | 94 | 2.6249 |
1.5091 | 1.2649 | 96 | 2.1462 |
1.6238 | 1.2914 | 98 | 2.8612 |
1.4244 | 1.3179 | 100 | 2.8897 |
1.6451 | 1.3444 | 102 | 2.2974 |
1.5182 | 1.3709 | 104 | 3.0305 |
1.1502 | 1.3974 | 106 | 2.7777 |
1.3721 | 1.4238 | 108 | 2.0768 |
1.8245 | 1.4503 | 110 | 2.5860 |
1.222 | 1.4768 | 112 | 2.6979 |
1.6133 | 1.5033 | 114 | 2.3744 |
1.2356 | 1.5298 | 116 | 2.5420 |
1.4606 | 1.5563 | 118 | 2.2398 |
1.3163 | 1.5828 | 120 | 2.4619 |
1.3804 | 1.6093 | 122 | 2.2209 |
1.4569 | 1.6358 | 124 | 2.8278 |
1.1365 | 1.6623 | 126 | 2.5291 |
1.5134 | 1.6887 | 128 | 2.5234 |
1.2794 | 1.7152 | 130 | 2.7923 |
1.1179 | 1.7417 | 132 | 2.2813 |
1.5328 | 1.7682 | 134 | 2.4505 |
1.4426 | 1.7947 | 136 | 3.1080 |
1.494 | 1.8212 | 138 | 2.6538 |
1.3861 | 1.8477 | 140 | 2.5577 |
1.3619 | 1.8742 | 142 | 2.7934 |
1.1387 | 1.9007 | 144 | 2.3147 |
1.1863 | 1.9272 | 146 | 2.3039 |
1.21 | 1.9536 | 148 | 2.6430 |
1.3249 | 1.9801 | 150 | 2.8306 |
2.0297 | 2.0 | 152 | 4.0284 |
1.3257 | 2.0265 | 154 | 4.3653 |
1.6556 | 2.0530 | 156 | 3.1694 |
1.2546 | 2.0795 | 158 | 2.4593 |
1.2407 | 2.1060 | 160 | 2.7860 |
1.2353 | 2.1325 | 162 | 2.6791 |
1.2571 | 2.1589 | 164 | 2.2779 |
1.5464 | 2.1854 | 166 | 2.6509 |
1.307 | 2.2119 | 168 | 3.1498 |
1.3582 | 2.2384 | 170 | 2.6785 |
1.1259 | 2.2649 | 172 | 2.3530 |
1.133 | 2.2914 | 174 | 2.5743 |
1.0692 | 2.3179 | 176 | 2.6552 |
1.2508 | 2.3444 | 178 | 2.4048 |
1.504 | 2.3709 | 180 | 2.7153 |
1.5213 | 2.3974 | 182 | 2.8302 |
1.4263 | 2.4238 | 184 | 2.7825 |
1.2581 | 2.4503 | 186 | 2.7872 |
1.3904 | 2.4768 | 188 | 2.9785 |
1.2969 | 2.5033 | 190 | 3.0633 |
1.4557 | 2.5298 | 192 | 3.3497 |
1.2728 | 2.5563 | 194 | 2.9371 |
1.154 | 2.5828 | 196 | 2.3541 |
1.5159 | 2.6093 | 198 | 2.5202 |
1.1535 | 2.6358 | 200 | 3.1148 |
1.2246 | 2.6623 | 202 | 3.1917 |
1.5538 | 2.6887 | 204 | 2.9176 |
1.3437 | 2.7152 | 206 | 2.9513 |
1.384 | 2.7417 | 208 | 3.0657 |
1.1712 | 2.7682 | 210 | 3.0711 |
1.1171 | 2.7947 | 212 | 2.6282 |
1.1222 | 2.8212 | 214 | 2.7070 |
1.0502 | 2.8477 | 216 | 3.1335 |
1.5044 | 2.8742 | 218 | 3.3894 |
1.1937 | 2.9007 | 220 | 2.9295 |
1.4499 | 2.9272 | 222 | 2.4740 |
1.1369 | 2.9536 | 224 | 2.6045 |
0.9361 | 2.9801 | 226 | 2.8151 |
0.9825 | 3.0 | 228 | 2.5062 |
1.1738 | 3.0265 | 230 | 2.1268 |
1.7623 | 3.0530 | 232 | 2.1633 |
1.2964 | 3.0795 | 234 | 2.5530 |
1.2397 | 3.1060 | 236 | 2.5847 |
1.1588 | 3.1325 | 238 | 2.3435 |
1.2689 | 3.1589 | 240 | 2.5115 |
1.2141 | 3.1854 | 242 | 2.6775 |
1.4553 | 3.2119 | 244 | 2.8488 |
1.1172 | 3.2384 | 246 | 2.7492 |
1.342 | 3.2649 | 248 | 2.9571 |
1.5291 | 3.2914 | 250 | 3.3224 |
1.3595 | 3.3179 | 252 | 3.7868 |
1.703 | 3.3444 | 254 | 3.9059 |
1.3682 | 3.3709 | 256 | 3.7033 |
1.4801 | 3.3974 | 258 | 3.1122 |
1.2444 | 3.4238 | 260 | 2.6195 |
1.0801 | 3.4503 | 262 | 2.4804 |
1.1966 | 3.4768 | 264 | 2.7418 |
1.1321 | 3.5033 | 266 | 2.9589 |
1.4521 | 3.5298 | 268 | 3.2319 |
1.5539 | 3.5563 | 270 | 3.7307 |
1.2244 | 3.5828 | 272 | 4.0520 |
1.1516 | 3.6093 | 274 | 3.7584 |
1.2933 | 3.6358 | 276 | 3.5864 |
1.3404 | 3.6623 | 278 | 3.7727 |
1.1585 | 3.6887 | 280 | 3.6076 |
1.0974 | 3.7152 | 282 | 3.2680 |
1.2817 | 3.7417 | 284 | 3.4060 |
1.2721 | 3.7682 | 286 | 3.7119 |
1.3948 | 3.7947 | 288 | 3.7408 |
1.14 | 3.8212 | 290 | 3.5425 |
1.5698 | 3.8477 | 292 | 3.2189 |
1.1783 | 3.8742 | 294 | 2.9808 |
0.9986 | 3.9007 | 296 | 2.8487 |
1.3001 | 3.9272 | 298 | 2.9359 |
1.0658 | 3.9536 | 300 | 2.9475 |
1.2286 | 3.9801 | 302 | 2.9914 |
1.7819 | 4.0 | 304 | 3.2627 |
1.2959 | 4.0265 | 306 | 3.5096 |
1.2636 | 4.0530 | 308 | 3.4164 |
1.6574 | 4.0795 | 310 | 3.2132 |
1.116 | 4.1060 | 312 | 3.0850 |
1.272 | 4.1325 | 314 | 3.1118 |
1.1456 | 4.1589 | 316 | 3.1339 |
1.4732 | 4.1854 | 318 | 3.2678 |
1.2832 | 4.2119 | 320 | 3.3656 |
1.1853 | 4.2384 | 322 | 3.3827 |
1.0865 | 4.2649 | 324 | 3.2203 |
1.1281 | 4.2914 | 326 | 3.0227 |
1.3469 | 4.3179 | 328 | 2.8155 |
1.3308 | 4.3444 | 330 | 2.7892 |
1.0484 | 4.3709 | 332 | 2.9204 |
1.279 | 4.3974 | 334 | 3.1109 |
1.165 | 4.4238 | 336 | 3.2519 |
1.2145 | 4.4503 | 338 | 3.2841 |
1.4212 | 4.4768 | 340 | 3.3603 |
1.3132 | 4.5033 | 342 | 3.3661 |
1.0477 | 4.5298 | 344 | 3.3598 |
1.6463 | 4.5563 | 346 | 3.4348 |
1.1416 | 4.5828 | 348 | 3.4522 |
1.2699 | 4.6093 | 350 | 3.4344 |
1.3752 | 4.6358 | 352 | 3.4206 |
1.4558 | 4.6623 | 354 | 3.4065 |
1.4562 | 4.6887 | 356 | 3.3909 |
0.9682 | 4.7152 | 358 | 3.3530 |
1.3652 | 4.7417 | 360 | 3.3158 |
1.2207 | 4.7682 | 362 | 3.2640 |
1.3417 | 4.7947 | 364 | 3.2225 |
0.948 | 4.8212 | 366 | 3.1754 |
1.2974 | 4.8477 | 368 | 3.1646 |
1.6318 | 4.8742 | 370 | 3.1947 |
1.5222 | 4.9007 | 372 | 3.2226 |
1.486 | 4.9272 | 374 | 3.2479 |
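Note that the validation loss fluctuates considerably across training and ends at 3.2479, well above the minimum of 2.0449 reached at step 90; an earlier checkpoint may therefore generalize better than the final one.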
### Framework versions
- PEFT 0.14.0
- Transformers 4.48.3
- Pytorch 2.6.0+cu124
- Datasets 3.3.2
- Tokenizers 0.21.0