english-tamil-colloquial-translator

This model is a fine-tuned version of unsloth/tinyllama-chat-bnb-4bit on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 9.8446
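
Since this repository is a PEFT adapter on top of unsloth/tinyllama-chat-bnb-4bit, loading it for inference would typically look like the sketch below. This is only an illustrative example, not an official usage snippet from the author: the repository ids come from this card, while the prompt text, prompt format, and generation settings are assumptions (the actual training prompt template is not documented here).

```python
# Minimal inference sketch (assumed standard PEFT adapter loading, not the author's script).
# Requires accelerate and bitsandbytes, since the base checkpoint is a 4-bit bnb model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "unsloth/tinyllama-chat-bnb-4bit"                       # base model named in this card
adapter_id = "JoannaCalamus/english-tamil-colloquial-translator"  # this adapter

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(base, adapter_id)
model.eval()

# Hypothetical prompt; the prompt format used during fine-tuning is not documented.
prompt = "Translate to colloquial Tamil: How are you doing today?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```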

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch expressing them as a TrainingArguments configuration follows the list):

  • learning_rate: 0.0003
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 8
  • optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 2
  • num_epochs: 10
  • mixed_precision_training: Native AMP
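
Taken together, the values above correspond roughly to the following transformers.TrainingArguments configuration. This is a reconstruction from the reported numbers, not the actual training script; output_dir is a placeholder, and fp16 is assumed from "Native AMP" mixed precision.

```python
# Hedged reconstruction of the reported hyperparameters; the real training script is not published.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="english-tamil-colloquial-translator",  # placeholder
    learning_rate=3e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=2,   # effective train batch size: 4 * 2 = 8
    seed=42,
    optim="adamw_torch",             # AdamW with betas=(0.9, 0.999), eps=1e-08 (defaults)
    lr_scheduler_type="linear",
    warmup_steps=2,
    num_train_epochs=10,
    fp16=True,                       # "Native AMP"; could equally be bf16 on newer GPUs
)
```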

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 15.7547 | 0.1053 | 2 | 9.8456 |
| 15.479 | 0.2105 | 4 | 9.8456 |
| 15.4493 | 0.3158 | 6 | 9.8456 |
| 15.7354 | 0.4211 | 8 | 9.8456 |
| 15.6945 | 0.5263 | 10 | 9.8456 |
| 15.7836 | 0.6316 | 12 | 9.8456 |
| 15.3208 | 0.7368 | 14 | 9.8456 |
| 15.4578 | 0.8421 | 16 | 9.8456 |
| 10.0112 | 0.9474 | 18 | 9.8455 |
| 8.7126 | 1.0526 | 20 | 9.8454 |
| 16.3992 | 1.1579 | 22 | 9.8453 |
| 9.6316 | 1.2632 | 24 | 9.8452 |
| 7.4174 | 1.3684 | 26 | 9.8451 |
| 7.0426 | 1.4737 | 28 | 9.8450 |
| 5.7603 | 1.5789 | 30 | 9.8448 |
| 5.2669 | 1.6842 | 32 | 9.8448 |
| 4.9417 | 1.7895 | 34 | 9.8449 |
| 4.7346 | 1.8947 | 36 | 9.8450 |
| 4.7132 | 2.0 | 38 | 9.8454 |
| 4.6515 | 2.1053 | 40 | 9.8458 |
| 4.6755 | 2.2105 | 42 | 9.8458 |
| 4.605 | 2.3158 | 44 | 9.8458 |
| 4.5911 | 2.4211 | 46 | 9.8461 |
| 4.5087 | 2.5263 | 48 | 9.8468 |
| 4.2059 | 2.6316 | 50 | 9.8476 |
| 4.4623 | 2.7368 | 52 | 9.8477 |
| 4.5385 | 2.8421 | 54 | 9.8472 |
| 4.4144 | 2.9474 | 56 | 9.8468 |
| 4.228 | 3.0526 | 58 | 9.8463 |
| 4.5131 | 3.1579 | 60 | 9.8457 |
| 4.3911 | 3.2632 | 62 | 9.8455 |
| 4.2745 | 3.3684 | 64 | 9.8453 |
| 4.2357 | 3.4737 | 66 | 9.8448 |
| 4.4461 | 3.5789 | 68 | 9.8444 |
| 4.4832 | 3.6842 | 70 | 9.8443 |
| 4.2667 | 3.7895 | 72 | 9.8441 |
| 4.4088 | 3.8947 | 74 | 9.8438 |
| 4.2727 | 4.0 | 76 | 9.8436 |
| 4.3983 | 4.1053 | 78 | 9.8435 |
| 4.4424 | 4.2105 | 80 | 9.8433 |
| 4.337 | 4.3158 | 82 | 9.8429 |
| 4.1702 | 4.4211 | 84 | 9.8426 |
| 4.4149 | 4.5263 | 86 | 9.8424 |
| 4.3636 | 4.6316 | 88 | 9.8422 |
| 4.129 | 4.7368 | 90 | 9.8422 |
| 4.3597 | 4.8421 | 92 | 9.8423 |
| 4.3975 | 4.9474 | 94 | 9.8425 |
| 4.514 | 5.0526 | 96 | 9.8428 |
| 4.4162 | 5.1579 | 98 | 9.8430 |
| 4.319 | 5.2632 | 100 | 9.8433 |
| 4.3345 | 5.3684 | 102 | 9.8437 |
| 4.3324 | 5.4737 | 104 | 9.8440 |
| 4.4339 | 5.5789 | 106 | 9.8443 |
| 4.2552 | 5.6842 | 108 | 9.8445 |
| 4.1977 | 5.7895 | 110 | 9.8446 |
| 4.3751 | 5.8947 | 112 | 9.8445 |
| 4.1503 | 6.0 | 114 | 9.8445 |
| 4.2694 | 6.1053 | 116 | 9.8444 |
| 4.1817 | 6.2105 | 118 | 9.8443 |
| 4.139 | 6.3158 | 120 | 9.8443 |
| 4.2565 | 6.4211 | 122 | 9.8444 |
| 4.1783 | 6.5263 | 124 | 9.8443 |
| 4.1413 | 6.6316 | 126 | 9.8444 |
| 4.376 | 6.7368 | 128 | 9.8443 |
| 4.3513 | 6.8421 | 130 | 9.8443 |
| 4.2998 | 6.9474 | 132 | 9.8444 |
| 4.3274 | 7.0526 | 134 | 9.8443 |
| 4.1745 | 7.1579 | 136 | 9.8443 |
| 4.264 | 7.2632 | 138 | 9.8443 |
| 4.2688 | 7.3684 | 140 | 9.8443 |
| 4.2694 | 7.4737 | 142 | 9.8443 |
| 4.0628 | 7.5789 | 144 | 9.8443 |
| 4.1035 | 7.6842 | 146 | 9.8443 |
| 4.1901 | 7.7895 | 148 | 9.8445 |
| 4.0909 | 7.8947 | 150 | 9.8445 |
| 4.1311 | 8.0 | 152 | 9.8445 |
| 4.019 | 8.1053 | 154 | 9.8444 |
| 4.3897 | 8.2105 | 156 | 9.8445 |
| 4.1649 | 8.3158 | 158 | 9.8445 |
| 4.2591 | 8.4211 | 160 | 9.8445 |
| 4.3012 | 8.5263 | 162 | 9.8446 |
| 4.2335 | 8.6316 | 164 | 9.8448 |
| 4.2651 | 8.7368 | 166 | 9.8448 |
| 4.0715 | 8.8421 | 168 | 9.8448 |
| 4.3972 | 8.9474 | 170 | 9.8448 |
| 4.1081 | 9.0526 | 172 | 9.8449 |
| 4.0342 | 9.1579 | 174 | 9.8449 |
| 4.1982 | 9.2632 | 176 | 9.8448 |
| 4.1429 | 9.3684 | 178 | 9.8447 |
| 4.4475 | 9.4737 | 180 | 9.8446 |
| 4.1772 | 9.5789 | 182 | 9.8447 |
| 4.175 | 9.6842 | 184 | 9.8446 |
| 4.0956 | 9.7895 | 186 | 9.8446 |
| 4.029 | 9.8947 | 188 | 9.8446 |
| 3.9935 | 10.0 | 190 | 9.8446 |

Framework versions

  • PEFT 0.14.0
  • Transformers 4.48.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.3.1
  • Tokenizers 0.21.0
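
A quick way to check a local environment against these pinned versions is a short sketch like the following (assuming the packages are installed and importable under their usual names):

```python
# Print installed versions to compare against the pins listed above.
import peft, transformers, torch, datasets, tokenizers

for name, module in [("PEFT", peft), ("Transformers", transformers),
                     ("PyTorch", torch), ("Datasets", datasets),
                     ("Tokenizers", tokenizers)]:
    print(f"{name}: {module.__version__}")
```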
