# english-tamil-colloquial-translator

This model is a fine-tuned version of unsloth/tinyllama-chat-bnb-4bit on an unspecified dataset. It achieves the following results on the evaluation set:

- Loss: 9.8446
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0003
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 8
- optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 2
- num_epochs: 10
- mixed_precision_training: Native AMP
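The effective batch size above follows from the per-device batch size and the gradient-accumulation steps. A minimal sketch of the reported configuration as plain Python (values copied from the list above):

```python
# Training configuration as reported in this card.
config = {
    "learning_rate": 3e-4,
    "train_batch_size": 4,
    "eval_batch_size": 4,
    "seed": 42,
    "gradient_accumulation_steps": 2,
    "lr_scheduler_type": "linear",
    "lr_scheduler_warmup_steps": 2,
    "num_epochs": 10,
}

# The total (effective) train batch size is the per-device batch size
# multiplied by the number of gradient-accumulation steps.
total_train_batch_size = (
    config["train_batch_size"] * config["gradient_accumulation_steps"]
)
print(total_train_batch_size)  # 8, matching the value reported above
```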
### Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
15.7547 | 0.1053 | 2 | 9.8456 |
15.479 | 0.2105 | 4 | 9.8456 |
15.4493 | 0.3158 | 6 | 9.8456 |
15.7354 | 0.4211 | 8 | 9.8456 |
15.6945 | 0.5263 | 10 | 9.8456 |
15.7836 | 0.6316 | 12 | 9.8456 |
15.3208 | 0.7368 | 14 | 9.8456 |
15.4578 | 0.8421 | 16 | 9.8456 |
10.0112 | 0.9474 | 18 | 9.8455 |
8.7126 | 1.0526 | 20 | 9.8454 |
16.3992 | 1.1579 | 22 | 9.8453 |
9.6316 | 1.2632 | 24 | 9.8452 |
7.4174 | 1.3684 | 26 | 9.8451 |
7.0426 | 1.4737 | 28 | 9.8450 |
5.7603 | 1.5789 | 30 | 9.8448 |
5.2669 | 1.6842 | 32 | 9.8448 |
4.9417 | 1.7895 | 34 | 9.8449 |
4.7346 | 1.8947 | 36 | 9.8450 |
4.7132 | 2.0 | 38 | 9.8454 |
4.6515 | 2.1053 | 40 | 9.8458 |
4.6755 | 2.2105 | 42 | 9.8458 |
4.605 | 2.3158 | 44 | 9.8458 |
4.5911 | 2.4211 | 46 | 9.8461 |
4.5087 | 2.5263 | 48 | 9.8468 |
4.2059 | 2.6316 | 50 | 9.8476 |
4.4623 | 2.7368 | 52 | 9.8477 |
4.5385 | 2.8421 | 54 | 9.8472 |
4.4144 | 2.9474 | 56 | 9.8468 |
4.228 | 3.0526 | 58 | 9.8463 |
4.5131 | 3.1579 | 60 | 9.8457 |
4.3911 | 3.2632 | 62 | 9.8455 |
4.2745 | 3.3684 | 64 | 9.8453 |
4.2357 | 3.4737 | 66 | 9.8448 |
4.4461 | 3.5789 | 68 | 9.8444 |
4.4832 | 3.6842 | 70 | 9.8443 |
4.2667 | 3.7895 | 72 | 9.8441 |
4.4088 | 3.8947 | 74 | 9.8438 |
4.2727 | 4.0 | 76 | 9.8436 |
4.3983 | 4.1053 | 78 | 9.8435 |
4.4424 | 4.2105 | 80 | 9.8433 |
4.337 | 4.3158 | 82 | 9.8429 |
4.1702 | 4.4211 | 84 | 9.8426 |
4.4149 | 4.5263 | 86 | 9.8424 |
4.3636 | 4.6316 | 88 | 9.8422 |
4.129 | 4.7368 | 90 | 9.8422 |
4.3597 | 4.8421 | 92 | 9.8423 |
4.3975 | 4.9474 | 94 | 9.8425 |
4.514 | 5.0526 | 96 | 9.8428 |
4.4162 | 5.1579 | 98 | 9.8430 |
4.319 | 5.2632 | 100 | 9.8433 |
4.3345 | 5.3684 | 102 | 9.8437 |
4.3324 | 5.4737 | 104 | 9.8440 |
4.4339 | 5.5789 | 106 | 9.8443 |
4.2552 | 5.6842 | 108 | 9.8445 |
4.1977 | 5.7895 | 110 | 9.8446 |
4.3751 | 5.8947 | 112 | 9.8445 |
4.1503 | 6.0 | 114 | 9.8445 |
4.2694 | 6.1053 | 116 | 9.8444 |
4.1817 | 6.2105 | 118 | 9.8443 |
4.139 | 6.3158 | 120 | 9.8443 |
4.2565 | 6.4211 | 122 | 9.8444 |
4.1783 | 6.5263 | 124 | 9.8443 |
4.1413 | 6.6316 | 126 | 9.8444 |
4.376 | 6.7368 | 128 | 9.8443 |
4.3513 | 6.8421 | 130 | 9.8443 |
4.2998 | 6.9474 | 132 | 9.8444 |
4.3274 | 7.0526 | 134 | 9.8443 |
4.1745 | 7.1579 | 136 | 9.8443 |
4.264 | 7.2632 | 138 | 9.8443 |
4.2688 | 7.3684 | 140 | 9.8443 |
4.2694 | 7.4737 | 142 | 9.8443 |
4.0628 | 7.5789 | 144 | 9.8443 |
4.1035 | 7.6842 | 146 | 9.8443 |
4.1901 | 7.7895 | 148 | 9.8445 |
4.0909 | 7.8947 | 150 | 9.8445 |
4.1311 | 8.0 | 152 | 9.8445 |
4.019 | 8.1053 | 154 | 9.8444 |
4.3897 | 8.2105 | 156 | 9.8445 |
4.1649 | 8.3158 | 158 | 9.8445 |
4.2591 | 8.4211 | 160 | 9.8445 |
4.3012 | 8.5263 | 162 | 9.8446 |
4.2335 | 8.6316 | 164 | 9.8448 |
4.2651 | 8.7368 | 166 | 9.8448 |
4.0715 | 8.8421 | 168 | 9.8448 |
4.3972 | 8.9474 | 170 | 9.8448 |
4.1081 | 9.0526 | 172 | 9.8449 |
4.0342 | 9.1579 | 174 | 9.8449 |
4.1982 | 9.2632 | 176 | 9.8448 |
4.1429 | 9.3684 | 178 | 9.8447 |
4.4475 | 9.4737 | 180 | 9.8446 |
4.1772 | 9.5789 | 182 | 9.8447 |
4.175 | 9.6842 | 184 | 9.8446 |
4.0956 | 9.7895 | 186 | 9.8446 |
4.029 | 9.8947 | 188 | 9.8446 |
3.9935 | 10.0 | 190 | 9.8446 |
### Framework versions

- PEFT 0.14.0
- Transformers 4.48.3
- PyTorch 2.6.0+cu124
- Datasets 3.3.1
- Tokenizers 0.21.0
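With the versions above, the adapter can be loaded for inference via PEFT. This is an untested sketch: the repo id is taken from this card, and the prompt wording and generation settings are illustrative assumptions, not values from the training setup.

```python
# Sketch (assumed usage, not from the card): load the PEFT adapter on top of
# its 4-bit base model and translate a sentence. Requires network access and
# the framework versions listed above; imports are deferred into the function
# so merely defining it needs nothing beyond the standard library.
def translate(text: str) -> str:
    from peft import AutoPeftModelForCausalLM
    from transformers import AutoTokenizer

    model_id = "JoannaCalamus/english-tamil-colloquial-translator"
    model = AutoPeftModelForCausalLM.from_pretrained(model_id)
    tokenizer = AutoTokenizer.from_pretrained(model_id)

    inputs = tokenizer(text, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=64)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```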
Model tree for JoannaCalamus/english-tamil-colloquial-translator:

- Base model: unsloth/tinyllama-chat-bnb-4bit