# telugu-colloquial-translator
This model is a fine-tuned version of unsloth/tinyllama-chat-bnb-4bit on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 13.8354
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0003
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 8
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 2
- num_epochs: 10
- mixed_precision_training: Native AMP
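The batch-size and scheduler settings above interact: with a per-device batch of 4 and 2 gradient-accumulation steps, the effective batch is 8, and the linear scheduler warms up over 2 steps and then decays to zero over the remaining steps. A minimal sketch of that schedule, assuming the standard semantics of Transformers' linear warmup scheduler and a total of 120 optimizer steps (12 per epoch × 10 epochs, matching the results table):

```python
def lr_at_step(step, base_lr=3e-4, warmup_steps=2, total_steps=120):
    """Linear warmup to base_lr over `warmup_steps`, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))


# Effective batch size: per-device batch 4 x gradient accumulation 2 = 8.
EFFECTIVE_BATCH = 4 * 2
```

With only 2 warmup steps out of 120, the schedule is essentially a linear decay from 3e-4; for example, the learning rate is back to half of its peak at the midpoint of training.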
Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
14.5051 | 0.1667 | 2 | 13.6164 |
14.5289 | 0.3333 | 4 | 13.6164 |
14.5076 | 0.5 | 6 | 13.6164 |
14.6086 | 0.6667 | 8 | 13.6164 |
14.6413 | 0.8333 | 10 | 13.6164 |
14.5621 | 1.0 | 12 | 13.6164 |
14.4257 | 1.1667 | 14 | 13.6164 |
14.4923 | 1.3333 | 16 | 13.5831 |
10.4555 | 1.5 | 18 | 13.6066 |
9.525 | 1.6667 | 20 | 13.6075 |
13.6701 | 1.8333 | 22 | 13.6110 |
8.0112 | 2.0 | 24 | 13.4903 |
6.3558 | 2.1667 | 26 | 13.4089 |
5.3682 | 2.3333 | 28 | 13.4581 |
4.8551 | 2.5 | 30 | 13.4554 |
4.5257 | 2.6667 | 32 | 13.4495 |
4.2491 | 2.8333 | 34 | 13.3523 |
4.0777 | 3.0 | 36 | 13.3952 |
3.9632 | 3.1667 | 38 | 13.4579 |
3.8663 | 3.3333 | 40 | 13.6159 |
3.8474 | 3.5 | 42 | 13.7999 |
3.8016 | 3.6667 | 44 | 13.8696 |
3.8016 | 3.8333 | 46 | 13.8831 |
3.7705 | 4.0 | 48 | 14.0455 |
3.6923 | 4.1667 | 50 | 14.2457 |
3.7001 | 4.3333 | 52 | 14.3837 |
3.7369 | 4.5 | 54 | 14.5811 |
3.7443 | 4.6667 | 56 | 14.7282 |
3.6995 | 4.8333 | 58 | 14.8502 |
3.6674 | 5.0 | 60 | 14.8549 |
3.6457 | 5.1667 | 62 | 14.7122 |
3.6976 | 5.3333 | 64 | 14.6955 |
3.6695 | 5.5 | 66 | 14.7200 |
3.7054 | 5.6667 | 68 | 14.7288 |
3.662 | 5.8333 | 70 | 14.6215 |
3.639 | 6.0 | 72 | 14.4830 |
3.5977 | 6.1667 | 74 | 14.3549 |
3.6848 | 6.3333 | 76 | 14.3106 |
3.6473 | 6.5 | 78 | 14.3912 |
3.5777 | 6.6667 | 80 | 14.4662 |
3.6259 | 6.8333 | 82 | 14.4309 |
3.6215 | 7.0 | 84 | 14.3144 |
3.5984 | 7.1667 | 86 | 14.1720 |
3.6425 | 7.3333 | 88 | 14.1530 |
3.6399 | 7.5 | 90 | 14.2482 |
3.5718 | 7.6667 | 92 | 14.2812 |
3.5997 | 7.8333 | 94 | 14.1919 |
3.5854 | 8.0 | 96 | 14.0361 |
3.5148 | 8.1667 | 98 | 13.9069 |
3.5795 | 8.3333 | 100 | 13.8721 |
3.543 | 8.5 | 102 | 13.9068 |
3.612 | 8.6667 | 104 | 13.9608 |
3.5358 | 8.8333 | 106 | 13.9630 |
3.5564 | 9.0 | 108 | 13.9936 |
3.5838 | 9.1667 | 110 | 13.9328 |
3.5536 | 9.3333 | 112 | 13.8414 |
3.5626 | 9.5 | 114 | 13.7961 |
3.5445 | 9.6667 | 116 | 13.8050 |
3.5275 | 9.8333 | 118 | 13.8261 |
3.5533 | 10.0 | 120 | 13.8354 |
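Note that validation loss bottoms out at 13.3523 around step 34 (epoch ~2.83) and then climbs before partially recovering, so the final checkpoint (loss 13.8354) is not the best one logged. A small sketch for picking the best checkpoint from such a log, using a few (step, validation loss) pairs transcribed from the table above:

```python
# A handful of (step: validation_loss) entries copied from the results table.
eval_log = {
    2: 13.6164,
    26: 13.4089,
    34: 13.3523,
    60: 14.8549,
    108: 13.9936,
    120: 13.8354,
}


def best_checkpoint(log):
    """Return the (step, loss) pair with the lowest validation loss."""
    return min(log.items(), key=lambda kv: kv[1])
```

If checkpoints were saved at eval steps, loading the step-34 checkpoint (or re-training with early stopping) would likely serve better than the final weights.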
### Framework versions
- PEFT 0.14.0
- Transformers 4.48.3
- Pytorch 2.6.0+cu124
- Datasets 3.3.2
- Tokenizers 0.21.0
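Since this repo is a PEFT adapter on top of unsloth/tinyllama-chat-bnb-4bit, it would typically be loaded with PEFT's `AutoPeftModelForCausalLM`. A hypothetical loading sketch (the prompt is an illustrative guess based on the model name; the card's intended-uses section is not yet filled in):

```python
# Adapter repo id and base model as stated in this card.
ADAPTER_ID = "ankitha29/telugu-colloquial-translator"
BASE_MODEL = "unsloth/tinyllama-chat-bnb-4bit"


def load_model(adapter_id=ADAPTER_ID):
    """Load the PEFT adapter together with its base model and tokenizer."""
    # Imported lazily so this sketch can be read/imported without heavy deps.
    from peft import AutoPeftModelForCausalLM
    from transformers import AutoTokenizer

    model = AutoPeftModelForCausalLM.from_pretrained(adapter_id)
    tokenizer = AutoTokenizer.from_pretrained(adapter_id)
    return model, tokenizer


if __name__ == "__main__":
    model, tokenizer = load_model()
    # Hypothetical prompt; the actual training prompt format is not documented.
    prompt = "Translate to colloquial Telugu: How are you?"
    inputs = tokenizer(prompt, return_tensors="pt")
    print(tokenizer.decode(model.generate(**inputs, max_new_tokens=64)[0]))
```

This assumes the adapter repo contains the usual `adapter_config.json`/tokenizer files; given the high final loss, outputs should be checked carefully before any use.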