english-tamil-colloquial-translator

This model is a PEFT adapter fine-tuned from unsloth/tinyllama-chat-bnb-4bit; the training dataset is not specified in this card. It achieves the following results on the evaluation set:

  • Loss: 9.3897
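
Since the card documents a PEFT adapter on a 4-bit base model, a minimal usage sketch is shown below. This is an illustration, not code from the card: the adapter is loaded with peft on top of the base checkpoint, and the prompt format is only an assumption.

```python
# Minimal usage sketch (assumed, not from this card): load the LoRA adapter
# on top of the 4-bit base model with transformers + peft.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "unsloth/tinyllama-chat-bnb-4bit"
adapter_id = "Bhanushree/english-tamil-colloquial-translator"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(base_model, adapter_id)

# The prompt format below is a guess; the card does not document one.
prompt = "Translate to colloquial Tamil: Where are you going?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```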

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the configuration sketch after the list):

  • learning_rate: 0.0003
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 8
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 2
  • num_epochs: 100
  • mixed_precision_training: Native AMP
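
The settings above map onto transformers.TrainingArguments roughly as follows. This is a minimal sketch, not the author's training script: the output directory is an assumption, fp16 is one plausible reading of "Native AMP", and the model, PEFT, and dataset wiring are omitted.

```python
# Minimal sketch (assumed, not the author's script): the listed hyperparameters
# expressed as a transformers.TrainingArguments configuration.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="english-tamil-colloquial-translator",  # assumed output path
    learning_rate=3e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    gradient_accumulation_steps=2,  # effective train batch size: 4 * 2 = 8
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=2,
    num_train_epochs=100,
    fp16=True,  # "Native AMP"; bf16 is also possible on recent GPUs
)
```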

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 14.9008 | 2.0 | 2 | 9.3971 |
| 14.9008 | 4.0 | 4 | 9.3971 |
| 14.9008 | 6.0 | 6 | 9.3971 |
| 14.9008 | 8.0 | 8 | 9.3971 |
| 14.9008 | 10.0 | 10 | 9.3971 |
| 14.9008 | 12.0 | 12 | 9.3971 |
| 14.9008 | 14.0 | 14 | 9.3971 |
| 14.9008 | 16.0 | 16 | 9.3977 |
| 10.3887 | 18.0 | 18 | 9.3991 |
| 9.7256 | 20.0 | 20 | 9.4000 |
| 14.4732 | 22.0 | 22 | 9.4008 |
| 8.5735 | 24.0 | 24 | 9.4009 |
| 6.1218 | 26.0 | 26 | 9.4010 |
| 5.4203 | 28.0 | 28 | 9.4013 |
| 4.9788 | 30.0 | 30 | 9.4006 |
| 4.6064 | 32.0 | 32 | 9.3992 |
| 4.2969 | 34.0 | 34 | 9.3959 |
| 4.0606 | 36.0 | 36 | 9.3922 |
| 3.9275 | 38.0 | 38 | 9.3896 |
| 3.832 | 40.0 | 40 | 9.3876 |
| 3.7489 | 42.0 | 42 | 9.3875 |
| 3.6895 | 44.0 | 44 | 9.3871 |
| 3.6586 | 46.0 | 46 | 9.3873 |
| 3.6318 | 48.0 | 48 | 9.3879 |
| 3.6202 | 50.0 | 50 | 9.3882 |
| 3.6077 | 52.0 | 52 | 9.3885 |
| 3.5982 | 54.0 | 54 | 9.3885 |
| 3.5925 | 56.0 | 56 | 9.3886 |
| 3.5892 | 58.0 | 58 | 9.3887 |
| 3.588 | 60.0 | 60 | 9.3890 |
| 3.5894 | 62.0 | 62 | 9.3894 |
| 3.5857 | 64.0 | 64 | 9.3893 |
| 3.5832 | 66.0 | 66 | 9.3893 |
| 3.5796 | 68.0 | 68 | 9.3897 |
| 3.5791 | 70.0 | 70 | 9.3897 |
| 3.58 | 72.0 | 72 | 9.3899 |
| 3.5805 | 74.0 | 74 | 9.3899 |
| 3.5786 | 76.0 | 76 | 9.3903 |
| 3.5769 | 78.0 | 78 | 9.3898 |
| 3.5753 | 80.0 | 80 | 9.3898 |
| 3.5743 | 82.0 | 82 | 9.3898 |
| 3.5745 | 84.0 | 84 | 9.3899 |
| 3.5758 | 86.0 | 86 | 9.3898 |
| 3.5753 | 88.0 | 88 | 9.3900 |
| 3.5747 | 90.0 | 90 | 9.3898 |
| 3.5747 | 92.0 | 92 | 9.3897 |
| 3.5727 | 94.0 | 94 | 9.3896 |
| 3.5734 | 96.0 | 96 | 9.3898 |
| 3.5738 | 98.0 | 98 | 9.3897 |
| 3.5737 | 100.0 | 100 | 9.3897 |

Framework versions

  • PEFT 0.14.0
  • Transformers 4.48.3
  • PyTorch 2.6.0+cu124
  • Datasets 3.3.1
  • Tokenizers 0.21.0