telugu-colloquial-translator

This model is a PEFT adapter fine-tuned from unsloth/tinyllama-chat-bnb-4bit on an unknown dataset. It achieves the following result on the evaluation set:

  • Loss: 13.8354
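
A minimal inference sketch follows; it assumes the adapter is published as ankitha29/telugu-colloquial-translator, that bitsandbytes and accelerate are installed for the 4-bit base model, and that the prompt shown is a plausible guess, since the expected prompt template is not documented.

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "unsloth/tinyllama-chat-bnb-4bit"
tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")

# Attach the PEFT adapter weights on top of the frozen 4-bit base model.
model = PeftModel.from_pretrained(base_model, "ankitha29/telugu-colloquial-translator")
model.eval()

# Hypothetical prompt; the expected prompt template is not documented.
prompt = "Translate to colloquial Telugu: How are you?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```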

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged TrainingArguments sketch reproducing them follows the list):

  • learning_rate: 0.0003
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 8
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 2
  • num_epochs: 10
  • mixed_precision_training: Native AMP
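
The same settings can be expressed as Hugging Face TrainingArguments. This is a hedged reconstruction, not the author's training script: output_dir and the evaluation/logging cadence are not reported and are assumptions here, and unlisted settings fall back to Transformers defaults.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="telugu-colloquial-translator",  # assumed output directory
    learning_rate=3e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    gradient_accumulation_steps=2,  # effective total train batch size: 4 * 2 = 8
    optim="adamw_torch",            # AdamW; betas=(0.9, 0.999) and eps=1e-8 are the defaults
    lr_scheduler_type="linear",
    warmup_steps=2,
    num_train_epochs=10,
    fp16=True,                      # "Native AMP" mixed precision; assumes a CUDA device
)
```

These arguments would then be passed to transformers.Trainer together with the PEFT-wrapped model and the (undocumented) training dataset.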

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 14.5051 | 0.1667 | 2 | 13.6164 |
| 14.5289 | 0.3333 | 4 | 13.6164 |
| 14.5076 | 0.5 | 6 | 13.6164 |
| 14.6086 | 0.6667 | 8 | 13.6164 |
| 14.6413 | 0.8333 | 10 | 13.6164 |
| 14.5621 | 1.0 | 12 | 13.6164 |
| 14.4257 | 1.1667 | 14 | 13.6164 |
| 14.4923 | 1.3333 | 16 | 13.5831 |
| 10.4555 | 1.5 | 18 | 13.6066 |
| 9.525 | 1.6667 | 20 | 13.6075 |
| 13.6701 | 1.8333 | 22 | 13.6110 |
| 8.0112 | 2.0 | 24 | 13.4903 |
| 6.3558 | 2.1667 | 26 | 13.4089 |
| 5.3682 | 2.3333 | 28 | 13.4581 |
| 4.8551 | 2.5 | 30 | 13.4554 |
| 4.5257 | 2.6667 | 32 | 13.4495 |
| 4.2491 | 2.8333 | 34 | 13.3523 |
| 4.0777 | 3.0 | 36 | 13.3952 |
| 3.9632 | 3.1667 | 38 | 13.4579 |
| 3.8663 | 3.3333 | 40 | 13.6159 |
| 3.8474 | 3.5 | 42 | 13.7999 |
| 3.8016 | 3.6667 | 44 | 13.8696 |
| 3.8016 | 3.8333 | 46 | 13.8831 |
| 3.7705 | 4.0 | 48 | 14.0455 |
| 3.6923 | 4.1667 | 50 | 14.2457 |
| 3.7001 | 4.3333 | 52 | 14.3837 |
| 3.7369 | 4.5 | 54 | 14.5811 |
| 3.7443 | 4.6667 | 56 | 14.7282 |
| 3.6995 | 4.8333 | 58 | 14.8502 |
| 3.6674 | 5.0 | 60 | 14.8549 |
| 3.6457 | 5.1667 | 62 | 14.7122 |
| 3.6976 | 5.3333 | 64 | 14.6955 |
| 3.6695 | 5.5 | 66 | 14.7200 |
| 3.7054 | 5.6667 | 68 | 14.7288 |
| 3.662 | 5.8333 | 70 | 14.6215 |
| 3.639 | 6.0 | 72 | 14.4830 |
| 3.5977 | 6.1667 | 74 | 14.3549 |
| 3.6848 | 6.3333 | 76 | 14.3106 |
| 3.6473 | 6.5 | 78 | 14.3912 |
| 3.5777 | 6.6667 | 80 | 14.4662 |
| 3.6259 | 6.8333 | 82 | 14.4309 |
| 3.6215 | 7.0 | 84 | 14.3144 |
| 3.5984 | 7.1667 | 86 | 14.1720 |
| 3.6425 | 7.3333 | 88 | 14.1530 |
| 3.6399 | 7.5 | 90 | 14.2482 |
| 3.5718 | 7.6667 | 92 | 14.2812 |
| 3.5997 | 7.8333 | 94 | 14.1919 |
| 3.5854 | 8.0 | 96 | 14.0361 |
| 3.5148 | 8.1667 | 98 | 13.9069 |
| 3.5795 | 8.3333 | 100 | 13.8721 |
| 3.543 | 8.5 | 102 | 13.9068 |
| 3.612 | 8.6667 | 104 | 13.9608 |
| 3.5358 | 8.8333 | 106 | 13.9630 |
| 3.5564 | 9.0 | 108 | 13.9936 |
| 3.5838 | 9.1667 | 110 | 13.9328 |
| 3.5536 | 9.3333 | 112 | 13.8414 |
| 3.5626 | 9.5 | 114 | 13.7961 |
| 3.5445 | 9.6667 | 116 | 13.8050 |
| 3.5275 | 9.8333 | 118 | 13.8261 |
| 3.5533 | 10.0 | 120 | 13.8354 |

Framework versions

  • PEFT 0.14.0
  • Transformers 4.48.3
  • PyTorch 2.6.0+cu124
  • Datasets 3.3.2
  • Tokenizers 0.21.0