english-telugu-colloquial-translator

This model is a fine-tuned version of unsloth/tinyllama-chat-bnb-4bit on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 10.1778

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 8
  • optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 2
  • num_epochs: 10
  • mixed_precision_training: Native AMP
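
For reference, here is a minimal sketch of how these settings would map onto a transformers TrainingArguments object. The output directory is a placeholder, and the original training script is not included in this card, so this is a reconstruction from the list above rather than the actual code used:

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the hyperparameters listed above;
# "outputs" is a placeholder output directory.
training_args = TrainingArguments(
    output_dir="outputs",
    learning_rate=3e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    gradient_accumulation_steps=2,  # effective train batch size: 4 * 2 = 8
    lr_scheduler_type="linear",
    warmup_steps=2,
    num_train_epochs=10,
    optim="adamw_torch",            # AdamW, betas=(0.9, 0.999), eps=1e-08
    fp16=True,                      # Native AMP mixed-precision training
)
```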

Training results

| Training Loss | Epoch   | Step | Validation Loss |
|:-------------:|:-------:|:----:|:---------------:|
| 15.7759       | 0.1333  | 2    | 10.1801         |
| 15.993        | 0.2667  | 4    | 10.1801         |
| 16.0528       | 0.4     | 6    | 10.1801         |
| 16.0569       | 0.5333  | 8    | 10.1801         |
| 16.0127       | 0.6667  | 10   | 10.1801         |
| 16.0612       | 0.8     | 12   | 10.1801         |
| 16.0757       | 0.9333  | 14   | 10.1801         |
| 16.0546       | 1.0667  | 16   | 10.1801         |
| 9.8125        | 1.2     | 18   | 10.1802         |
| 8.5746        | 1.3333  | 20   | 10.1802         |
| 10.1687       | 1.4667  | 22   | 10.1802         |
| 9.4165        | 1.6     | 24   | 10.1802         |
| 7.2767        | 1.7333  | 26   | 10.1803         |
| 6.1989        | 1.8667  | 28   | 10.1802         |
| 5.7179        | 2.0     | 30   | 10.1802         |
| 5.3125        | 2.1333  | 32   | 10.1802         |
| 5.0667        | 2.2667  | 34   | 10.1803         |
| 4.9048        | 2.4     | 36   | 10.1803         |
| 4.842         | 2.5333  | 38   | 10.1801         |
| 4.7353        | 2.6667  | 40   | 10.1800         |
| 4.6989        | 2.8     | 42   | 10.1798         |
| 4.6702        | 2.9333  | 44   | 10.1796         |
| 4.7066        | 3.0667  | 46   | 10.1795         |
| 4.6867        | 3.2     | 48   | 10.1794         |
| 4.6946        | 3.3333  | 50   | 10.1794         |
| 4.651         | 3.4667  | 52   | 10.1794         |
| 4.6442        | 3.6     | 54   | 10.1795         |
| 4.6016        | 3.7333  | 56   | 10.1794         |
| 4.6168        | 3.8667  | 58   | 10.1793         |
| 4.6177        | 4.0     | 60   | 10.1794         |
| 4.692         | 4.1333  | 62   | 10.1792         |
| 4.6193        | 4.2667  | 64   | 10.1792         |
| 4.5829        | 4.4     | 66   | 10.1792         |
| 4.5955        | 4.5333  | 68   | 10.1793         |
| 4.6238        | 4.6667  | 70   | 10.1793         |
| 4.5963        | 4.8     | 72   | 10.1793         |
| 4.6115        | 4.9333  | 74   | 10.1792         |
| 4.5897        | 5.0667  | 76   | 10.1791         |
| 4.5634        | 5.2     | 78   | 10.1791         |
| 4.6068        | 5.3333  | 80   | 10.1790         |
| 4.5435        | 5.4667  | 82   | 10.1790         |
| 4.6129        | 5.6     | 84   | 10.1790         |
| 4.6147        | 5.7333  | 86   | 10.1789         |
| 4.6223        | 5.8667  | 88   | 10.1788         |
| 4.5862        | 6.0     | 90   | 10.1787         |
| 4.5616        | 6.1333  | 92   | 10.1786         |
| 4.5576        | 6.2667  | 94   | 10.1784         |
| 4.5668        | 6.4     | 96   | 10.1783         |
| 4.594         | 6.5333  | 98   | 10.1783         |
| 4.5498        | 6.6667  | 100  | 10.1784         |
| 4.5728        | 6.8     | 102  | 10.1784         |
| 4.6117        | 6.9333  | 104  | 10.1784         |
| 4.5075        | 7.0667  | 106  | 10.1784         |
| 4.5465        | 7.2     | 108  | 10.1784         |
| 4.509         | 7.3333  | 110  | 10.1783         |
| 4.5868        | 7.4667  | 112  | 10.1783         |
| 4.5992        | 7.6     | 114  | 10.1783         |
| 4.5945        | 7.7333  | 116  | 10.1782         |
| 4.5364        | 7.8667  | 118  | 10.1782         |
| 4.5324        | 8.0     | 120  | 10.1783         |
| 4.4714        | 8.1333  | 122  | 10.1782         |
| 4.5462        | 8.2667  | 124  | 10.1782         |
| 4.5717        | 8.4     | 126  | 10.1782         |
| 4.5188        | 8.5333  | 128  | 10.1782         |
| 4.4999        | 8.6667  | 130  | 10.1781         |
| 4.5296        | 8.8     | 132  | 10.1780         |
| 4.5303        | 8.9333  | 134  | 10.1780         |
| 4.5506        | 9.0667  | 136  | 10.1780         |
| 4.521         | 9.2     | 138  | 10.1779         |
| 4.5278        | 9.3333  | 140  | 10.1779         |
| 4.4858        | 9.4667  | 142  | 10.1779         |
| 4.5483        | 9.6     | 144  | 10.1778         |
| 4.5304        | 9.7333  | 146  | 10.1778         |
| 4.4881        | 9.8667  | 148  | 10.1778         |
| 4.5163        | 10.0    | 150  | 10.1778         |

Framework versions

  • PEFT 0.14.0
  • Transformers 4.48.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.3.1
  • Tokenizers 0.21.0
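
Because this repository contains a PEFT adapter rather than a standalone model, it can be loaded with the versions above via peft's AutoPeftModelForCausalLM, which pulls in the base model (unsloth/tinyllama-chat-bnb-4bit) automatically. The sketch below is illustrative only; in particular, the prompt format is a hypothetical assumption, since the card does not document the expected input template:

```python
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

# Load the base model together with this LoRA adapter.
model = AutoPeftModelForCausalLM.from_pretrained(
    "Mythili12/english-telugu-colloquial-translator"
)
tokenizer = AutoTokenizer.from_pretrained(
    "Mythili12/english-telugu-colloquial-translator"
)

# Hypothetical prompt; the actual training template is not documented here.
prompt = "Translate to colloquial Telugu: How are you?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```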