english-tamil-colloquial-translator

This model is a fine-tuned version of unsloth/tinyllama-chat-bnb-4bit. The training dataset is not documented in this card. It achieves the following results on the evaluation set:

  • Loss: 9.3960

Model description

More information needed

Intended uses & limitations

More information needed
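No usage example is included in the card. The following is a minimal inference sketch, not the author's documented method; it assumes the adapter follows the standard PEFT layout, that bitsandbytes is installed for the 4-bit base checkpoint, and that the base model's chat template applies. The prompt wording is illustrative only.

```python
# Minimal inference sketch -- assumptions: standard PEFT adapter layout,
# bitsandbytes available for the 4-bit base checkpoint, and the TinyLlama
# chat template; the prompt wording below is illustrative, not from the card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "unsloth/tinyllama-chat-bnb-4bit"
adapter_id = "Keerthana4/english-tamil-colloquial-translator"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(base_model, adapter_id)

messages = [
    {"role": "user",
     "content": "Translate to colloquial Tamil: Where are you going?"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(base_model.device)

with torch.no_grad():
    output_ids = model.generate(input_ids, max_new_tokens=64)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```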

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged code sketch reproducing them follows the list):

  • learning_rate: 0.0003
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 8
  • optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 2
  • num_epochs: 10
  • mixed_precision_training: Native AMP
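
For reference, here is a sketch of how the reported values map onto Hugging Face TrainingArguments. The actual training script, dataset, and LoRA configuration are not documented in this card, so treat this as an approximation rather than the original setup.

```python
# Hedged reconstruction of the reported hyperparameters as TrainingArguments.
# The eval/logging cadence is inferred from the results table (eval every 2 steps);
# fp16 stands in for "Native AMP"; everything else follows the list above.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="english-tamil-colloquial-translator",
    learning_rate=3e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=2,   # effective train batch size: 8
    num_train_epochs=10,
    lr_scheduler_type="linear",
    warmup_steps=2,
    optim="adamw_torch",             # AdamW with betas=(0.9, 0.999), eps=1e-8 (defaults)
    seed=42,
    fp16=True,                       # "Native AMP" mixed precision (fp16 assumed)
    eval_strategy="steps",
    eval_steps=2,
    logging_steps=2,
)
```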

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 14.7234       | 0.3333 | 2    | 9.4108          |
| 14.9363       | 0.6667 | 4    | 9.4108          |
| 14.9228       | 1.0    | 6    | 9.4108          |
| 14.9161       | 1.3333 | 8    | 9.4108          |
| 14.7165       | 1.6667 | 10   | 9.4108          |
| 14.9611       | 2.0    | 12   | 9.4108          |
| 14.9045       | 2.3333 | 14   | 9.4108          |
| 14.9345       | 2.6667 | 16   | 9.4112          |
| 10.4373       | 3.0    | 18   | 9.4115          |
| 8.9482        | 3.3333 | 20   | 9.4115          |
| 10.0361       | 3.6667 | 22   | 9.4114          |
| 7.861         | 4.0    | 24   | 9.4112          |
| 6.7595        | 4.3333 | 26   | 9.4108          |
| 6.2089        | 4.6667 | 28   | 9.4102          |
| 5.8107        | 5.0    | 30   | 9.4098          |
| 5.1732        | 5.3333 | 32   | 9.4095          |
| 4.8956        | 5.6667 | 34   | 9.4090          |
| 4.6136        | 6.0    | 36   | 9.4080          |
| 4.3897        | 6.3333 | 38   | 9.4066          |
| 4.2181        | 6.6667 | 40   | 9.4051          |
| 4.155         | 7.0    | 42   | 9.4033          |
| 4.0174        | 7.3333 | 44   | 9.4025          |
| 3.9963        | 7.6667 | 46   | 9.4021          |
| 3.9986        | 8.0    | 48   | 9.4010          |
| 3.8247        | 8.3333 | 50   | 9.3994          |
| 3.9088        | 8.6667 | 52   | 9.3990          |
| 3.9343        | 9.0    | 54   | 9.3985          |
| 3.866         | 9.3333 | 56   | 9.3977          |
| 3.7683        | 9.6667 | 58   | 9.3968          |
| 3.882         | 10.0   | 60   | 9.3960          |

Framework versions

  • PEFT 0.14.0
  • Transformers 4.48.3
  • PyTorch 2.6.0+cu124
  • Datasets 3.3.1
  • Tokenizers 0.21.0