# english-tamil-colloquial-translator

This model is a fine-tuned version of unsloth/tinyllama-chat-bnb-4bit on an unspecified dataset. It achieves the following results on the evaluation set:

- Loss: 9.8446
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0003
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 8
- optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 2
- num_epochs: 10
- mixed_precision_training: Native AMP
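The effective batch size above follows from the per-device batch size and the gradient-accumulation steps. A minimal sketch of the reported configuration as plain Python (values copied from the list above):

```python
# Training configuration as reported in this card.
config = {
    "learning_rate": 3e-4,
    "train_batch_size": 4,
    "eval_batch_size": 4,
    "seed": 42,
    "gradient_accumulation_steps": 2,
    "lr_scheduler_type": "linear",
    "lr_scheduler_warmup_steps": 2,
    "num_epochs": 10,
}

# The total (effective) train batch size is the per-device batch size
# multiplied by the number of gradient-accumulation steps.
total_train_batch_size = (
    config["train_batch_size"] * config["gradient_accumulation_steps"]
)
print(total_train_batch_size)  # 8, matching the value reported above
```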
### Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
15.7547 | 0.1053 | 2 | 9.8456 |
15.479 | 0.2105 | 4 | 9.8456 |
15.4493 | 0.3158 | 6 | 9.8456 |
15.7354 | 0.4211 | 8 | 9.8456 |
15.6945 | 0.5263 | 10 | 9.8456 |
15.7836 | 0.6316 | 12 | 9.8456 |
15.3208 | 0.7368 | 14 | 9.8456 |
15.4578 | 0.8421 | 16 | 9.8456 |
10.0112 | 0.9474 | 18 | 9.8455 |
8.7126 | 1.0526 | 20 | 9.8454 |
16.3992 | 1.1579 | 22 | 9.8453 |
9.6316 | 1.2632 | 24 | 9.8452 |
7.4174 | 1.3684 | 26 | 9.8451 |
7.0426 | 1.4737 | 28 | 9.8450 |
5.7603 | 1.5789 | 30 | 9.8448 |
5.2669 | 1.6842 | 32 | 9.8448 |
4.9417 | 1.7895 | 34 | 9.8449 |
4.7346 | 1.8947 | 36 | 9.8450 |
4.7132 | 2.0 | 38 | 9.8454 |
4.6515 | 2.1053 | 40 | 9.8458 |
4.6755 | 2.2105 | 42 | 9.8458 |
4.605 | 2.3158 | 44 | 9.8458 |
4.5911 | 2.4211 | 46 | 9.8461 |
4.5087 | 2.5263 | 48 | 9.8468 |
4.2059 | 2.6316 | 50 | 9.8476 |
4.4623 | 2.7368 | 52 | 9.8477 |
4.5385 | 2.8421 | 54 | 9.8472 |
4.4144 | 2.9474 | 56 | 9.8468 |
4.228 | 3.0526 | 58 | 9.8463 |
4.5131 | 3.1579 | 60 | 9.8457 |
4.3911 | 3.2632 | 62 | 9.8455 |
4.2745 | 3.3684 | 64 | 9.8453 |
4.2357 | 3.4737 | 66 | 9.8448 |
4.4461 | 3.5789 | 68 | 9.8444 |
4.4832 | 3.6842 | 70 | 9.8443 |
4.2667 | 3.7895 | 72 | 9.8441 |
4.4088 | 3.8947 | 74 | 9.8438 |
4.2727 | 4.0 | 76 | 9.8436 |
4.3983 | 4.1053 | 78 | 9.8435 |
4.4424 | 4.2105 | 80 | 9.8433 |
4.337 | 4.3158 | 82 | 9.8429 |
4.1702 | 4.4211 | 84 | 9.8426 |
4.4149 | 4.5263 | 86 | 9.8424 |
4.3636 | 4.6316 | 88 | 9.8422 |
4.129 | 4.7368 | 90 | 9.8422 |
4.3597 | 4.8421 | 92 | 9.8423 |
4.3975 | 4.9474 | 94 | 9.8425 |
4.514 | 5.0526 | 96 | 9.8428 |
4.4162 | 5.1579 | 98 | 9.8430 |
4.319 | 5.2632 | 100 | 9.8433 |
4.3345 | 5.3684 | 102 | 9.8437 |
4.3324 | 5.4737 | 104 | 9.8440 |
4.4339 | 5.5789 | 106 | 9.8443 |
4.2552 | 5.6842 | 108 | 9.8445 |
4.1977 | 5.7895 | 110 | 9.8446 |
4.3751 | 5.8947 | 112 | 9.8445 |
4.1503 | 6.0 | 114 | 9.8445 |
4.2694 | 6.1053 | 116 | 9.8444 |
4.1817 | 6.2105 | 118 | 9.8443 |
4.139 | 6.3158 | 120 | 9.8443 |
4.2565 | 6.4211 | 122 | 9.8444 |
4.1783 | 6.5263 | 124 | 9.8443 |
4.1413 | 6.6316 | 126 | 9.8444 |
4.376 | 6.7368 | 128 | 9.8443 |
4.3513 | 6.8421 | 130 | 9.8443 |
4.2998 | 6.9474 | 132 | 9.8444 |
4.3274 | 7.0526 | 134 | 9.8443 |
4.1745 | 7.1579 | 136 | 9.8443 |
4.264 | 7.2632 | 138 | 9.8443 |
4.2688 | 7.3684 | 140 | 9.8443 |
4.2694 | 7.4737 | 142 | 9.8443 |
4.0628 | 7.5789 | 144 | 9.8443 |
4.1035 | 7.6842 | 146 | 9.8443 |
4.1901 | 7.7895 | 148 | 9.8445 |
4.0909 | 7.8947 | 150 | 9.8445 |
4.1311 | 8.0 | 152 | 9.8445 |
4.019 | 8.1053 | 154 | 9.8444 |
4.3897 | 8.2105 | 156 | 9.8445 |
4.1649 | 8.3158 | 158 | 9.8445 |
4.2591 | 8.4211 | 160 | 9.8445 |
4.3012 | 8.5263 | 162 | 9.8446 |
4.2335 | 8.6316 | 164 | 9.8448 |
4.2651 | 8.7368 | 166 | 9.8448 |
4.0715 | 8.8421 | 168 | 9.8448 |
4.3972 | 8.9474 | 170 | 9.8448 |
4.1081 | 9.0526 | 172 | 9.8449 |
4.0342 | 9.1579 | 174 | 9.8449 |
4.1982 | 9.2632 | 176 | 9.8448 |
4.1429 | 9.3684 | 178 | 9.8447 |
4.4475 | 9.4737 | 180 | 9.8446 |
4.1772 | 9.5789 | 182 | 9.8447 |
4.175 | 9.6842 | 184 | 9.8446 |
4.0956 | 9.7895 | 186 | 9.8446 |
4.029 | 9.8947 | 188 | 9.8446 |
3.9935 | 10.0 | 190 | 9.8446 |
### Framework versions

- PEFT 0.14.0
- Transformers 4.48.3
- PyTorch 2.6.0+cu124
- Datasets 3.3.1
- Tokenizers 0.21.0
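With the versions above, the adapter can be loaded for inference via PEFT. This is an untested sketch: the repo id is taken from this card, and the prompt wording and generation settings are illustrative assumptions, not values from the training setup.

```python
# Sketch (assumed usage, not from the card): load the PEFT adapter on top of
# its 4-bit base model and translate a sentence. Requires network access and
# the framework versions listed above; imports are deferred into the function
# so merely defining it needs nothing beyond the standard library.
def translate(text: str) -> str:
    from peft import AutoPeftModelForCausalLM
    from transformers import AutoTokenizer

    model_id = "JoannaCalamus/english-tamil-colloquial-translator"
    model = AutoPeftModelForCausalLM.from_pretrained(model_id)
    tokenizer = AutoTokenizer.from_pretrained(model_id)

    inputs = tokenizer(text, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=64)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```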
Model tree for JoannaCalamus/english-tamil-colloquial-translator:

- Base model: unsloth/tinyllama-chat-bnb-4bit