english-telugu-colloquial-translator

This model is a fine-tuned version of unsloth/tinyllama-chat-bnb-4bit on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 10.1778

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 8
  • optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 2
  • num_epochs: 10
  • mixed_precision_training: Native AMP
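
For reference, here is a minimal sketch of how these settings would map onto a transformers TrainingArguments object. The output directory is a placeholder, and the original training script is not included in this card, so this is a reconstruction from the list above rather than the actual code used:

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the hyperparameters listed above;
# "outputs" is a placeholder output directory.
training_args = TrainingArguments(
    output_dir="outputs",
    learning_rate=3e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    gradient_accumulation_steps=2,  # effective train batch size: 4 * 2 = 8
    lr_scheduler_type="linear",
    warmup_steps=2,
    num_train_epochs=10,
    optim="adamw_torch",            # AdamW, betas=(0.9, 0.999), eps=1e-08
    fp16=True,                      # Native AMP mixed-precision training
)
```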

Training results

| Training Loss | Epoch   | Step | Validation Loss |
|:-------------:|:-------:|:----:|:---------------:|
| 15.7759       | 0.1333  | 2    | 10.1801         |
| 15.993        | 0.2667  | 4    | 10.1801         |
| 16.0528       | 0.4     | 6    | 10.1801         |
| 16.0569       | 0.5333  | 8    | 10.1801         |
| 16.0127       | 0.6667  | 10   | 10.1801         |
| 16.0612       | 0.8     | 12   | 10.1801         |
| 16.0757       | 0.9333  | 14   | 10.1801         |
| 16.0546       | 1.0667  | 16   | 10.1801         |
| 9.8125        | 1.2     | 18   | 10.1802         |
| 8.5746        | 1.3333  | 20   | 10.1802         |
| 10.1687       | 1.4667  | 22   | 10.1802         |
| 9.4165        | 1.6     | 24   | 10.1802         |
| 7.2767        | 1.7333  | 26   | 10.1803         |
| 6.1989        | 1.8667  | 28   | 10.1802         |
| 5.7179        | 2.0     | 30   | 10.1802         |
| 5.3125        | 2.1333  | 32   | 10.1802         |
| 5.0667        | 2.2667  | 34   | 10.1803         |
| 4.9048        | 2.4     | 36   | 10.1803         |
| 4.842         | 2.5333  | 38   | 10.1801         |
| 4.7353        | 2.6667  | 40   | 10.1800         |
| 4.6989        | 2.8     | 42   | 10.1798         |
| 4.6702        | 2.9333  | 44   | 10.1796         |
| 4.7066        | 3.0667  | 46   | 10.1795         |
| 4.6867        | 3.2     | 48   | 10.1794         |
| 4.6946        | 3.3333  | 50   | 10.1794         |
| 4.651         | 3.4667  | 52   | 10.1794         |
| 4.6442        | 3.6     | 54   | 10.1795         |
| 4.6016        | 3.7333  | 56   | 10.1794         |
| 4.6168        | 3.8667  | 58   | 10.1793         |
| 4.6177        | 4.0     | 60   | 10.1794         |
| 4.692         | 4.1333  | 62   | 10.1792         |
| 4.6193        | 4.2667  | 64   | 10.1792         |
| 4.5829        | 4.4     | 66   | 10.1792         |
| 4.5955        | 4.5333  | 68   | 10.1793         |
| 4.6238        | 4.6667  | 70   | 10.1793         |
| 4.5963        | 4.8     | 72   | 10.1793         |
| 4.6115        | 4.9333  | 74   | 10.1792         |
| 4.5897        | 5.0667  | 76   | 10.1791         |
| 4.5634        | 5.2     | 78   | 10.1791         |
| 4.6068        | 5.3333  | 80   | 10.1790         |
| 4.5435        | 5.4667  | 82   | 10.1790         |
| 4.6129        | 5.6     | 84   | 10.1790         |
| 4.6147        | 5.7333  | 86   | 10.1789         |
| 4.6223        | 5.8667  | 88   | 10.1788         |
| 4.5862        | 6.0     | 90   | 10.1787         |
| 4.5616        | 6.1333  | 92   | 10.1786         |
| 4.5576        | 6.2667  | 94   | 10.1784         |
| 4.5668        | 6.4     | 96   | 10.1783         |
| 4.594         | 6.5333  | 98   | 10.1783         |
| 4.5498        | 6.6667  | 100  | 10.1784         |
| 4.5728        | 6.8     | 102  | 10.1784         |
| 4.6117        | 6.9333  | 104  | 10.1784         |
| 4.5075        | 7.0667  | 106  | 10.1784         |
| 4.5465        | 7.2     | 108  | 10.1784         |
| 4.509         | 7.3333  | 110  | 10.1783         |
| 4.5868        | 7.4667  | 112  | 10.1783         |
| 4.5992        | 7.6     | 114  | 10.1783         |
| 4.5945        | 7.7333  | 116  | 10.1782         |
| 4.5364        | 7.8667  | 118  | 10.1782         |
| 4.5324        | 8.0     | 120  | 10.1783         |
| 4.4714        | 8.1333  | 122  | 10.1782         |
| 4.5462        | 8.2667  | 124  | 10.1782         |
| 4.5717        | 8.4     | 126  | 10.1782         |
| 4.5188        | 8.5333  | 128  | 10.1782         |
| 4.4999        | 8.6667  | 130  | 10.1781         |
| 4.5296        | 8.8     | 132  | 10.1780         |
| 4.5303        | 8.9333  | 134  | 10.1780         |
| 4.5506        | 9.0667  | 136  | 10.1780         |
| 4.521         | 9.2     | 138  | 10.1779         |
| 4.5278        | 9.3333  | 140  | 10.1779         |
| 4.4858        | 9.4667  | 142  | 10.1779         |
| 4.5483        | 9.6     | 144  | 10.1778         |
| 4.5304        | 9.7333  | 146  | 10.1778         |
| 4.4881        | 9.8667  | 148  | 10.1778         |
| 4.5163        | 10.0    | 150  | 10.1778         |

Framework versions

  • PEFT 0.14.0
  • Transformers 4.48.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.3.1
  • Tokenizers 0.21.0
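
Because this repository contains a PEFT adapter rather than a standalone model, it can be loaded with the versions above via peft's AutoPeftModelForCausalLM, which pulls in the base model (unsloth/tinyllama-chat-bnb-4bit) automatically. The sketch below is illustrative only; in particular, the prompt format is a hypothetical assumption, since the card does not document the expected input template:

```python
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

# Load the base model together with this LoRA adapter.
model = AutoPeftModelForCausalLM.from_pretrained(
    "Mythili12/english-telugu-colloquial-translator"
)
tokenizer = AutoTokenizer.from_pretrained(
    "Mythili12/english-telugu-colloquial-translator"
)

# Hypothetical prompt; the actual training template is not documented here.
prompt = "Translate to colloquial Telugu: How are you?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```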