telugu-colloquial-translator
This model is a fine-tuned version of unsloth/tinyllama-chat-bnb-4bit. The training dataset is not specified. It achieves the following results on the evaluation set:
- Loss: 9.2129
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0003
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 8
- optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 2
- num_epochs: 10
- mixed_precision_training: Native AMP
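As a rough sketch, the hyperparameters above can be written as keyword arguments using the parameter names of `transformers.TrainingArguments`. Two values are assumptions rather than facts from this card: `fp16` (the card only says "Native AMP", which could also mean bf16) and `TOTAL_STEPS = 80` (the results table shows 8 optimizer steps per epoch, and 10 epochs were configured). The `linear_lr` helper is hypothetical; it mirrors the usual linear warmup followed by linear decay and is not taken from the training script.

```python
# Hyperparameters from the list above, keyed by TrainingArguments-style names.
training_kwargs = {
    "learning_rate": 3e-4,
    "per_device_train_batch_size": 4,
    "per_device_eval_batch_size": 4,
    "seed": 42,
    "gradient_accumulation_steps": 2,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "warmup_steps": 2,
    "num_train_epochs": 10,
    "fp16": True,  # assumption: "Native AMP" could also mean bf16
}

# Effective batch size per optimizer step (a single device is assumed).
total_train_batch_size = (
    training_kwargs["per_device_train_batch_size"]
    * training_kwargs["gradient_accumulation_steps"]
)


def linear_lr(step, total_steps, base_lr=3e-4, warmup_steps=2):
    """Hypothetical helper: linear warmup from 0, then linear decay to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = (total_steps - step) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, remaining)


TOTAL_STEPS = 80  # assumption: 8 optimizer steps per epoch * 10 epochs

print("total_train_batch_size:", total_train_batch_size)
for step in (0, 1, 2, 41, 80):
    print(step, linear_lr(step, TOTAL_STEPS))
```

Under these assumptions the learning rate reaches its peak of 0.0003 at step 2 and decays linearly to zero at step 80.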
Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
14.6163 | 0.2667 | 2 | 9.2468 |
14.5969 | 0.5333 | 4 | 9.2468 |
14.676 | 0.8 | 6 | 9.2468 |
14.5883 | 1.0 | 8 | 9.2468 |
14.6246 | 1.2667 | 10 | 9.2468 |
14.5945 | 1.5333 | 12 | 9.2468 |
14.6381 | 1.8 | 14 | 9.2468 |
14.5547 | 2.0 | 16 | 9.2468 |
12.8615 | 2.2667 | 18 | 9.2478 |
9.3963 | 2.5333 | 20 | 9.2486 |
13.8496 | 2.8 | 22 | 9.2491 |
9.2063 | 3.0 | 24 | 9.2487 |
6.4105 | 3.2667 | 26 | 9.2479 |
5.6432 | 3.5333 | 28 | 9.2472 |
5.2603 | 3.8 | 30 | 9.2459 |
4.8088 | 4.0 | 32 | 9.2440 |
4.5624 | 4.2667 | 34 | 9.2414 |
4.313 | 4.5333 | 36 | 9.2390 |
4.1646 | 4.8 | 38 | 9.2368 |
4.0837 | 5.0 | 40 | 9.2344 |
3.9956 | 5.2667 | 42 | 9.2319 |
3.9756 | 5.5333 | 44 | 9.2298 |
3.8652 | 5.8 | 46 | 9.2285 |
3.9191 | 6.0 | 48 | 9.2272 |
3.8893 | 6.2667 | 50 | 9.2251 |
3.8461 | 6.5333 | 52 | 9.2231 |
3.7788 | 6.8 | 54 | 9.2219 |
3.8249 | 7.0 | 56 | 9.2203 |
3.7855 | 7.2667 | 58 | 9.2183 |
3.7378 | 7.5333 | 60 | 9.2165 |
3.7678 | 7.8 | 62 | 9.2157 |
3.7418 | 8.0 | 64 | 9.2151 |
3.726 | 8.2667 | 66 | 9.2143 |
3.6778 | 8.5333 | 68 | 9.2134 |
3.7152 | 8.8 | 70 | 9.2129 |
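One way to read the table is to compare how far each loss moved between the first and last logged steps. The sketch below uses only those two rows; the variable and function names are illustrative.

```python
# First and last logged rows from the table above:
# (training loss, epoch, step, validation loss)
first = (14.6163, 0.2667, 2, 9.2468)
last = (3.7152, 8.8, 70, 9.2129)


def relative_drop(start, end):
    """Fractional decrease from start to end."""
    return (start - end) / start


train_drop = relative_drop(first[0], last[0])
val_drop = relative_drop(first[3], last[3])

print(f"training loss drop:   {train_drop:.1%}")
print(f"validation loss drop: {val_drop:.2%}")
```

By this measure the training loss falls by about 75% over the logged steps, while the validation loss falls by less than 0.4%.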
Framework versions
- PEFT 0.14.0
- Transformers 4.48.3
- PyTorch 2.6.0+cu124
- Datasets 3.3.1
- Tokenizers 0.21.0
Model tree for Bhavyasree09/telugu-colloquial-translator
Base model
unsloth/tinyllama-chat-bnb-4bit