# tamil-colloquial-english-translate-model

This model is a fine-tuned version of [unsloth/tinyllama-chat-bnb-4bit](https://huggingface.co/unsloth/tinyllama-chat-bnb-4bit) on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 3.2479
## Model description
More information needed
## Intended uses & limitations
More information needed
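Pending a fuller description, the sketch below shows one plausible way to load and query the model. It assumes this repository holds a PEFT adapter for the 4-bit base model (consistent with the PEFT version listed under "Framework versions"), that `bitsandbytes` is installed for the quantized weights, and that the base tokenizer's chat template applies; the prompt itself is hypothetical, since the training data and prompt format are not documented.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "unsloth/tinyllama-chat-bnb-4bit"
adapter_id = "Sruthi-sai-2004/tamil-colloquial-english-translate-model"

# Load the 4-bit base model (requires bitsandbytes) and its tokenizer.
tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")

# Attach the fine-tuned adapter weights from this repository.
model = PeftModel.from_pretrained(base_model, adapter_id)

# Hypothetical prompt; the actual format used during fine-tuning is unknown.
messages = [{"role": "user", "content": "Translate to English: enna da panra?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=64)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```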
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 0.0003
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 8
- optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 2
- num_epochs: 5
- mixed_precision_training: Native AMP
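For reference, these settings map onto a `transformers` `TrainingArguments` object roughly as below. This is a sketch, not the author's actual training script; `output_dir` is assumed, and anything not listed above is left at its default.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="tamil-colloquial-english-translate-model",  # assumed name
    learning_rate=3e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=2,  # effective train batch size: 4 * 2 = 8
    num_train_epochs=5,
    lr_scheduler_type="linear",
    warmup_steps=2,
    optim="adamw_torch",            # AdamW with default betas=(0.9, 0.999), eps=1e-8
    seed=42,
    fp16=True,                      # native AMP; assumes fp16 rather than bf16 hardware
)
```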
### Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
8.6422 | 0.0265 | 2 | 6.0298 |
6.9944 | 0.0530 | 4 | 5.1665 |
6.7864 | 0.0795 | 6 | 5.1665 |
3.7423 | 0.1060 | 8 | 3.3416 |
2.7338 | 0.1325 | 10 | 2.7392 |
2.9393 | 0.1589 | 12 | 3.0218 |
2.2037 | 0.1854 | 14 | 2.7683 |
2.0967 | 0.2119 | 16 | 3.0727 |
1.9914 | 0.2384 | 18 | 3.4467 |
2.0141 | 0.2649 | 20 | 5.4803 |
2.4236 | 0.2914 | 22 | 7.4676 |
3.463 | 0.3179 | 24 | 7.5226 |
2.1304 | 0.3444 | 26 | 7.1676 |
2.4984 | 0.3709 | 28 | 6.9342 |
1.6678 | 0.3974 | 30 | 6.9310 |
2.3985 | 0.4238 | 32 | 6.4441 |
1.794 | 0.4503 | 34 | 5.7529 |
2.4018 | 0.4768 | 36 | 5.2054 |
2.4566 | 0.5033 | 38 | 4.9671 |
2.2224 | 0.5298 | 40 | 4.5316 |
1.94 | 0.5563 | 42 | 3.9569 |
1.9299 | 0.5828 | 44 | 3.7401 |
1.719 | 0.6093 | 46 | 3.6166 |
1.7334 | 0.6358 | 48 | 3.0501 |
1.6715 | 0.6623 | 50 | 2.5901 |
1.3979 | 0.6887 | 52 | 2.6318 |
1.9087 | 0.7152 | 54 | 2.9039 |
1.8333 | 0.7417 | 56 | 2.8776 |
1.6035 | 0.7682 | 58 | 2.7774 |
1.41 | 0.7947 | 60 | 2.5044 |
1.9075 | 0.8212 | 62 | 2.8705 |
1.608 | 0.8477 | 64 | 2.7610 |
1.7068 | 0.8742 | 66 | 2.9243 |
1.6267 | 0.9007 | 68 | 2.3522 |
1.4378 | 0.9272 | 70 | 2.6712 |
1.8967 | 0.9536 | 72 | 3.7529 |
1.4106 | 0.9801 | 74 | 2.9436 |
2.0129 | 1.0 | 76 | 3.1104 |
1.2537 | 1.0265 | 78 | 2.7373 |
1.5516 | 1.0530 | 80 | 2.4722 |
1.4263 | 1.0795 | 82 | 2.1472 |
0.9644 | 1.1060 | 84 | 2.3321 |
1.679 | 1.1325 | 86 | 2.9275 |
1.2739 | 1.1589 | 88 | 2.1865 |
1.4228 | 1.1854 | 90 | 2.0449 |
1.3859 | 1.2119 | 92 | 2.6632 |
1.6259 | 1.2384 | 94 | 2.6249 |
1.5091 | 1.2649 | 96 | 2.1462 |
1.6238 | 1.2914 | 98 | 2.8612 |
1.4244 | 1.3179 | 100 | 2.8897 |
1.6451 | 1.3444 | 102 | 2.2974 |
1.5182 | 1.3709 | 104 | 3.0305 |
1.1502 | 1.3974 | 106 | 2.7777 |
1.3721 | 1.4238 | 108 | 2.0768 |
1.8245 | 1.4503 | 110 | 2.5860 |
1.222 | 1.4768 | 112 | 2.6979 |
1.6133 | 1.5033 | 114 | 2.3744 |
1.2356 | 1.5298 | 116 | 2.5420 |
1.4606 | 1.5563 | 118 | 2.2398 |
1.3163 | 1.5828 | 120 | 2.4619 |
1.3804 | 1.6093 | 122 | 2.2209 |
1.4569 | 1.6358 | 124 | 2.8278 |
1.1365 | 1.6623 | 126 | 2.5291 |
1.5134 | 1.6887 | 128 | 2.5234 |
1.2794 | 1.7152 | 130 | 2.7923 |
1.1179 | 1.7417 | 132 | 2.2813 |
1.5328 | 1.7682 | 134 | 2.4505 |
1.4426 | 1.7947 | 136 | 3.1080 |
1.494 | 1.8212 | 138 | 2.6538 |
1.3861 | 1.8477 | 140 | 2.5577 |
1.3619 | 1.8742 | 142 | 2.7934 |
1.1387 | 1.9007 | 144 | 2.3147 |
1.1863 | 1.9272 | 146 | 2.3039 |
1.21 | 1.9536 | 148 | 2.6430 |
1.3249 | 1.9801 | 150 | 2.8306 |
2.0297 | 2.0 | 152 | 4.0284 |
1.3257 | 2.0265 | 154 | 4.3653 |
1.6556 | 2.0530 | 156 | 3.1694 |
1.2546 | 2.0795 | 158 | 2.4593 |
1.2407 | 2.1060 | 160 | 2.7860 |
1.2353 | 2.1325 | 162 | 2.6791 |
1.2571 | 2.1589 | 164 | 2.2779 |
1.5464 | 2.1854 | 166 | 2.6509 |
1.307 | 2.2119 | 168 | 3.1498 |
1.3582 | 2.2384 | 170 | 2.6785 |
1.1259 | 2.2649 | 172 | 2.3530 |
1.133 | 2.2914 | 174 | 2.5743 |
1.0692 | 2.3179 | 176 | 2.6552 |
1.2508 | 2.3444 | 178 | 2.4048 |
1.504 | 2.3709 | 180 | 2.7153 |
1.5213 | 2.3974 | 182 | 2.8302 |
1.4263 | 2.4238 | 184 | 2.7825 |
1.2581 | 2.4503 | 186 | 2.7872 |
1.3904 | 2.4768 | 188 | 2.9785 |
1.2969 | 2.5033 | 190 | 3.0633 |
1.4557 | 2.5298 | 192 | 3.3497 |
1.2728 | 2.5563 | 194 | 2.9371 |
1.154 | 2.5828 | 196 | 2.3541 |
1.5159 | 2.6093 | 198 | 2.5202 |
1.1535 | 2.6358 | 200 | 3.1148 |
1.2246 | 2.6623 | 202 | 3.1917 |
1.5538 | 2.6887 | 204 | 2.9176 |
1.3437 | 2.7152 | 206 | 2.9513 |
1.384 | 2.7417 | 208 | 3.0657 |
1.1712 | 2.7682 | 210 | 3.0711 |
1.1171 | 2.7947 | 212 | 2.6282 |
1.1222 | 2.8212 | 214 | 2.7070 |
1.0502 | 2.8477 | 216 | 3.1335 |
1.5044 | 2.8742 | 218 | 3.3894 |
1.1937 | 2.9007 | 220 | 2.9295 |
1.4499 | 2.9272 | 222 | 2.4740 |
1.1369 | 2.9536 | 224 | 2.6045 |
0.9361 | 2.9801 | 226 | 2.8151 |
0.9825 | 3.0 | 228 | 2.5062 |
1.1738 | 3.0265 | 230 | 2.1268 |
1.7623 | 3.0530 | 232 | 2.1633 |
1.2964 | 3.0795 | 234 | 2.5530 |
1.2397 | 3.1060 | 236 | 2.5847 |
1.1588 | 3.1325 | 238 | 2.3435 |
1.2689 | 3.1589 | 240 | 2.5115 |
1.2141 | 3.1854 | 242 | 2.6775 |
1.4553 | 3.2119 | 244 | 2.8488 |
1.1172 | 3.2384 | 246 | 2.7492 |
1.342 | 3.2649 | 248 | 2.9571 |
1.5291 | 3.2914 | 250 | 3.3224 |
1.3595 | 3.3179 | 252 | 3.7868 |
1.703 | 3.3444 | 254 | 3.9059 |
1.3682 | 3.3709 | 256 | 3.7033 |
1.4801 | 3.3974 | 258 | 3.1122 |
1.2444 | 3.4238 | 260 | 2.6195 |
1.0801 | 3.4503 | 262 | 2.4804 |
1.1966 | 3.4768 | 264 | 2.7418 |
1.1321 | 3.5033 | 266 | 2.9589 |
1.4521 | 3.5298 | 268 | 3.2319 |
1.5539 | 3.5563 | 270 | 3.7307 |
1.2244 | 3.5828 | 272 | 4.0520 |
1.1516 | 3.6093 | 274 | 3.7584 |
1.2933 | 3.6358 | 276 | 3.5864 |
1.3404 | 3.6623 | 278 | 3.7727 |
1.1585 | 3.6887 | 280 | 3.6076 |
1.0974 | 3.7152 | 282 | 3.2680 |
1.2817 | 3.7417 | 284 | 3.4060 |
1.2721 | 3.7682 | 286 | 3.7119 |
1.3948 | 3.7947 | 288 | 3.7408 |
1.14 | 3.8212 | 290 | 3.5425 |
1.5698 | 3.8477 | 292 | 3.2189 |
1.1783 | 3.8742 | 294 | 2.9808 |
0.9986 | 3.9007 | 296 | 2.8487 |
1.3001 | 3.9272 | 298 | 2.9359 |
1.0658 | 3.9536 | 300 | 2.9475 |
1.2286 | 3.9801 | 302 | 2.9914 |
1.7819 | 4.0 | 304 | 3.2627 |
1.2959 | 4.0265 | 306 | 3.5096 |
1.2636 | 4.0530 | 308 | 3.4164 |
1.6574 | 4.0795 | 310 | 3.2132 |
1.116 | 4.1060 | 312 | 3.0850 |
1.272 | 4.1325 | 314 | 3.1118 |
1.1456 | 4.1589 | 316 | 3.1339 |
1.4732 | 4.1854 | 318 | 3.2678 |
1.2832 | 4.2119 | 320 | 3.3656 |
1.1853 | 4.2384 | 322 | 3.3827 |
1.0865 | 4.2649 | 324 | 3.2203 |
1.1281 | 4.2914 | 326 | 3.0227 |
1.3469 | 4.3179 | 328 | 2.8155 |
1.3308 | 4.3444 | 330 | 2.7892 |
1.0484 | 4.3709 | 332 | 2.9204 |
1.279 | 4.3974 | 334 | 3.1109 |
1.165 | 4.4238 | 336 | 3.2519 |
1.2145 | 4.4503 | 338 | 3.2841 |
1.4212 | 4.4768 | 340 | 3.3603 |
1.3132 | 4.5033 | 342 | 3.3661 |
1.0477 | 4.5298 | 344 | 3.3598 |
1.6463 | 4.5563 | 346 | 3.4348 |
1.1416 | 4.5828 | 348 | 3.4522 |
1.2699 | 4.6093 | 350 | 3.4344 |
1.3752 | 4.6358 | 352 | 3.4206 |
1.4558 | 4.6623 | 354 | 3.4065 |
1.4562 | 4.6887 | 356 | 3.3909 |
0.9682 | 4.7152 | 358 | 3.3530 |
1.3652 | 4.7417 | 360 | 3.3158 |
1.2207 | 4.7682 | 362 | 3.2640 |
1.3417 | 4.7947 | 364 | 3.2225 |
0.948 | 4.8212 | 366 | 3.1754 |
1.2974 | 4.8477 | 368 | 3.1646 |
1.6318 | 4.8742 | 370 | 3.1947 |
1.5222 | 4.9007 | 372 | 3.2226 |
1.486 | 4.9272 | 374 | 3.2479 |
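Note that the validation loss fluctuates considerably across training and ends at 3.2479, well above the minimum of 2.0449 reached at step 90; an earlier checkpoint may therefore generalize better than the final one.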
### Framework versions
- PEFT 0.14.0
- Transformers 4.48.3
- Pytorch 2.6.0+cu124
- Datasets 3.3.2
- Tokenizers 0.21.0