english-tamil-colloquial-translator

This model is a fine-tuned version of unsloth/tinyllama-chat-bnb-4bit. The training dataset is not documented in this card. It achieves the following results on the evaluation set:

  • Loss: 9.3960

Model description

More information needed

Intended uses & limitations

More information needed
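No usage example is included in the card. The following is a minimal inference sketch, not the author's documented method; it assumes the adapter follows the standard PEFT layout, that bitsandbytes is installed for the 4-bit base checkpoint, and that the base model's chat template applies. The prompt wording is illustrative only.

```python
# Minimal inference sketch -- assumptions: standard PEFT adapter layout,
# bitsandbytes available for the 4-bit base checkpoint, and the TinyLlama
# chat template; the prompt wording below is illustrative, not from the card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "unsloth/tinyllama-chat-bnb-4bit"
adapter_id = "Keerthana4/english-tamil-colloquial-translator"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(base_model, adapter_id)

messages = [
    {"role": "user",
     "content": "Translate to colloquial Tamil: Where are you going?"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(base_model.device)

with torch.no_grad():
    output_ids = model.generate(input_ids, max_new_tokens=64)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```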

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged code sketch reproducing them follows the list):

  • learning_rate: 0.0003
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 8
  • optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 2
  • num_epochs: 10
  • mixed_precision_training: Native AMP
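
For reference, here is a sketch of how the reported values map onto Hugging Face TrainingArguments. The actual training script, dataset, and LoRA configuration are not documented in this card, so treat this as an approximation rather than the original setup.

```python
# Hedged reconstruction of the reported hyperparameters as TrainingArguments.
# The eval/logging cadence is inferred from the results table (eval every 2 steps);
# fp16 stands in for "Native AMP"; everything else follows the list above.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="english-tamil-colloquial-translator",
    learning_rate=3e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=2,   # effective train batch size: 8
    num_train_epochs=10,
    lr_scheduler_type="linear",
    warmup_steps=2,
    optim="adamw_torch",             # AdamW with betas=(0.9, 0.999), eps=1e-8 (defaults)
    seed=42,
    fp16=True,                       # "Native AMP" mixed precision (fp16 assumed)
    eval_strategy="steps",
    eval_steps=2,
    logging_steps=2,
)
```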

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 14.7234       | 0.3333 | 2    | 9.4108          |
| 14.9363       | 0.6667 | 4    | 9.4108          |
| 14.9228       | 1.0    | 6    | 9.4108          |
| 14.9161       | 1.3333 | 8    | 9.4108          |
| 14.7165       | 1.6667 | 10   | 9.4108          |
| 14.9611       | 2.0    | 12   | 9.4108          |
| 14.9045       | 2.3333 | 14   | 9.4108          |
| 14.9345       | 2.6667 | 16   | 9.4112          |
| 10.4373       | 3.0    | 18   | 9.4115          |
| 8.9482        | 3.3333 | 20   | 9.4115          |
| 10.0361       | 3.6667 | 22   | 9.4114          |
| 7.861         | 4.0    | 24   | 9.4112          |
| 6.7595        | 4.3333 | 26   | 9.4108          |
| 6.2089        | 4.6667 | 28   | 9.4102          |
| 5.8107        | 5.0    | 30   | 9.4098          |
| 5.1732        | 5.3333 | 32   | 9.4095          |
| 4.8956        | 5.6667 | 34   | 9.4090          |
| 4.6136        | 6.0    | 36   | 9.4080          |
| 4.3897        | 6.3333 | 38   | 9.4066          |
| 4.2181        | 6.6667 | 40   | 9.4051          |
| 4.155         | 7.0    | 42   | 9.4033          |
| 4.0174        | 7.3333 | 44   | 9.4025          |
| 3.9963        | 7.6667 | 46   | 9.4021          |
| 3.9986        | 8.0    | 48   | 9.4010          |
| 3.8247        | 8.3333 | 50   | 9.3994          |
| 3.9088        | 8.6667 | 52   | 9.3990          |
| 3.9343        | 9.0    | 54   | 9.3985          |
| 3.866         | 9.3333 | 56   | 9.3977          |
| 3.7683        | 9.6667 | 58   | 9.3968          |
| 3.882         | 10.0   | 60   | 9.3960          |

Framework versions

  • PEFT 0.14.0
  • Transformers 4.48.3
  • PyTorch 2.6.0+cu124
  • Datasets 3.3.1
  • Tokenizers 0.21.0