english-tamil-colloquial-translator

This model is a PEFT adapter fine-tuned from unsloth/tinyllama-chat-bnb-4bit; the training dataset is not named in this card. It achieves the following results on the evaluation set:

  • Loss: 15.5350
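
Since the card gives no usage section, here is a minimal inference sketch, assuming the adapter repo id Muskan8/english-tamil-colloquial-translator (from the model tree) and TinyLlama's chat template; the prompt wording is illustrative, and bitsandbytes plus accelerate are needed because the base model is quantized to 4-bit:

```python
# Minimal inference sketch (not from the card): attach the LoRA/PEFT adapter
# to the 4-bit base model and translate one English sentence.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "unsloth/tinyllama-chat-bnb-4bit"
adapter_id = "Muskan8/english-tamil-colloquial-translator"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(base, adapter_id)  # attach the adapter weights
model.eval()

# TinyLlama-chat ships a chat template; the instruction phrasing is an assumption.
messages = [{"role": "user",
             "content": "Translate to colloquial Tamil: How are you?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    out = model.generate(input_ids, max_new_tokens=64)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(out[0][input_ids.shape[-1]:], skip_special_tokens=True))
```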

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.0003
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 8
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 2
  • num_epochs: 10
  • mixed_precision_training: Native AMP
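
A hedged reconstruction of these settings as transformers TrainingArguments (a sketch, not the author's actual script): output_dir and the 2-step eval/logging cadence are inferred from the results table below, and fp16 is assumed as the native-AMP mode.

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="english-tamil-colloquial-translator",  # assumed name
    learning_rate=3e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=2,  # total train batch size: 4 * 2 = 8
    seed=42,
    optim="adamw_torch",            # default betas=(0.9, 0.999), eps=1e-08
    lr_scheduler_type="linear",
    warmup_steps=2,
    num_train_epochs=10,
    fp16=True,                      # native AMP mixed precision (assumed fp16)
    eval_strategy="steps",
    eval_steps=2,                   # matches the eval cadence in the table below
    logging_steps=2,
)
```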

Training results

Training Loss | Epoch | Step | Validation Loss
5.7041 | 0.1429 | 2 | 13.0282
5.704 | 0.2857 | 4 | 13.0282
5.7187 | 0.4286 | 6 | 13.0282
5.586 | 0.5714 | 8 | 13.0282
5.6787 | 0.7143 | 10 | 13.0451
5.0249 | 0.8571 | 12 | 13.2511
4.648 | 1.0 | 14 | 13.5719
4.3156 | 1.1429 | 16 | 13.6280
4.1271 | 1.2857 | 18 | 13.7019
4.011 | 1.4286 | 20 | 13.8716
3.9699 | 1.5714 | 22 | 14.1971
3.9664 | 1.7143 | 24 | 14.4633
3.9311 | 1.8571 | 26 | 14.7040
3.9328 | 2.0 | 28 | 15.0162
3.8147 | 2.1429 | 30 | 15.1551
3.8951 | 2.2857 | 32 | 15.2817
3.8677 | 2.4286 | 34 | 15.4876
3.9061 | 2.5714 | 36 | 15.5635
3.9226 | 2.7143 | 38 | 15.4968
3.8848 | 2.8571 | 40 | 15.5246
3.8577 | 3.0 | 42 | 15.7132
3.846 | 3.1429 | 44 | 15.7763
3.8536 | 3.2857 | 46 | 15.8040
3.8046 | 3.4286 | 48 | 15.7172
3.8163 | 3.5714 | 50 | 15.8060
3.8613 | 3.7143 | 52 | 15.8977
3.8648 | 3.8571 | 54 | 15.9296
3.822 | 4.0 | 56 | 15.8540
3.8502 | 4.1429 | 58 | 15.8113
3.6146 | 4.2857 | 60 | 15.8318
3.8076 | 4.4286 | 62 | 16.0217
3.7733 | 4.5714 | 64 | 16.1423
3.7919 | 4.7143 | 66 | 16.1771
3.8009 | 4.8571 | 68 | 16.1739
3.7928 | 5.0 | 70 | 16.1656
3.784 | 5.1429 | 72 | 16.1487
3.803 | 5.2857 | 74 | 16.1381
3.7914 | 5.4286 | 76 | 16.1417
3.7116 | 5.5714 | 78 | 16.1394
3.8199 | 5.7143 | 80 | 16.0901
3.7847 | 5.8571 | 82 | 16.0322
3.7701 | 6.0 | 84 | 15.9598
3.7567 | 6.1429 | 86 | 15.9265
3.7088 | 6.2857 | 88 | 15.8973
3.7141 | 6.4286 | 90 | 15.9301
3.7817 | 6.5714 | 92 | 15.9283
3.7652 | 6.7143 | 94 | 15.8789
3.7088 | 6.8571 | 96 | 15.8300
3.7512 | 7.0 | 98 | 15.7586
3.6948 | 7.1429 | 100 | 15.7267
3.7015 | 7.2857 | 102 | 15.7416
3.5739 | 7.4286 | 104 | 15.7994
3.7293 | 7.5714 | 106 | 15.8474
3.6986 | 7.7143 | 108 | 15.8830
3.7108 | 7.8571 | 110 | 15.8448
3.7302 | 8.0 | 112 | 15.7926
3.6948 | 8.1429 | 114 | 15.7935
3.7046 | 8.2857 | 116 | 15.7859
3.7349 | 8.4286 | 118 | 15.7760
3.7133 | 8.5714 | 120 | 15.7194
3.7227 | 8.7143 | 122 | 15.6614
3.6846 | 8.8571 | 124 | 15.6043
3.6667 | 9.0 | 126 | 15.5441
3.6827 | 9.1429 | 128 | 15.4927
3.6769 | 9.2857 | 130 | 15.4628
3.6676 | 9.4286 | 132 | 15.4675
3.7032 | 9.5714 | 134 | 15.4744
3.6917 | 9.7143 | 136 | 15.4825
3.6667 | 9.8571 | 138 | 15.5181
3.6514 | 10.0 | 140 | 15.5350

Framework versions

  • PEFT 0.14.0
  • Transformers 4.48.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.3.2
  • Tokenizers 0.21.0
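
A quick way to confirm a matching environment before loading the adapter (a sketch; the card does not pin bitsandbytes or unsloth versions):

```python
import datasets, peft, tokenizers, torch, transformers

# Expected, per the versions listed above:
# PEFT 0.14.0, Transformers 4.48.3, PyTorch 2.6.0+cu124,
# Datasets 3.3.2, Tokenizers 0.21.0
for name, mod in [("PEFT", peft), ("Transformers", transformers),
                  ("PyTorch", torch), ("Datasets", datasets),
                  ("Tokenizers", tokenizers)]:
    print(name, mod.__version__)
```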