---
library_name: transformers
license: apache-2.0
base_model: Helsinki-NLP/opus-mt-en-hi
tags:
  - generated_from_trainer
model-index:
  - name: english-hindi-colloquial-translator
    results: []
---

# english-hindi-colloquial-translator

This model is a fine-tuned version of [Helsinki-NLP/opus-mt-en-hi](https://huggingface.co/Helsinki-NLP/opus-mt-en-hi) on an unspecified dataset.
It achieves the following results on the evaluation set:

- Loss: 0.0407
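
Since usage details are not documented below, here is a minimal inference sketch using the standard Transformers seq2seq API. The repo id is a placeholder assumption; substitute the path where this checkpoint is actually published.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Placeholder repo id (assumption) -- replace with the actual checkpoint path.
model_id = "english-hindi-colloquial-translator"

# Marian-based checkpoints need the sentencepiece package for tokenization.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("How are you doing these days?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```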

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (see the sketch after this list):

- learning_rate: 0.0005
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: 8-bit AdamW via bitsandbytes (`OptimizerNames.ADAMW_BNB`) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 10
- num_epochs: 3
- mixed_precision_training: Native AMP
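
For reproducibility, the list above maps onto `Seq2SeqTrainingArguments` roughly as follows. This is a hedged sketch, not the author's script: `output_dir` and the evaluation cadence are assumptions (the results table suggests evaluation every 10 steps).

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of training arguments matching the listed hyperparameters.
training_args = Seq2SeqTrainingArguments(
    output_dir="english-hindi-colloquial-translator",  # assumed
    learning_rate=5e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_bnb_8bit",   # OptimizerNames.ADAMW_BNB; requires bitsandbytes
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=10,
    num_train_epochs=3,
    fp16=True,                # "Native AMP" mixed-precision training
    eval_strategy="steps",    # assumed from the eval log cadence
    eval_steps=10,            # assumed: validation loss logged every 10 steps
)
```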

### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 6.1566 | 0.0327 | 10 | 2.2221 |
| 1.5545 | 0.0654 | 20 | 1.1247 |
| 1.0138 | 0.0980 | 30 | 0.9714 |
| 0.958 | 0.1307 | 40 | 0.8039 |
| 0.7836 | 0.1634 | 50 | 0.6756 |
| 0.6956 | 0.1961 | 60 | 0.6029 |
| 0.6507 | 0.2288 | 70 | 0.5687 |
| 0.5793 | 0.2614 | 80 | 0.4917 |
| 0.5423 | 0.2941 | 90 | 0.4520 |
| 0.5646 | 0.3268 | 100 | 0.4577 |
| 0.4474 | 0.3595 | 110 | 0.3724 |
| 0.4625 | 0.3922 | 120 | 0.3530 |
| 0.4297 | 0.4248 | 130 | 0.3179 |
| 0.3639 | 0.4575 | 140 | 0.2942 |
| 0.3343 | 0.4902 | 150 | 0.2919 |
| 0.3488 | 0.5229 | 160 | 0.2786 |
| 0.3062 | 0.5556 | 170 | 0.2330 |
| 0.3013 | 0.5882 | 180 | 0.2675 |
| 0.2586 | 0.6209 | 190 | 0.2246 |
| 0.2686 | 0.6536 | 200 | 0.2242 |
| 0.2456 | 0.6863 | 210 | 0.2115 |
| 0.2897 | 0.7190 | 220 | 0.2206 |
| 0.2275 | 0.7516 | 230 | 0.1907 |
| 0.2274 | 0.7843 | 240 | 0.1813 |
| 0.2204 | 0.8170 | 250 | 0.1782 |
| 0.2318 | 0.8497 | 260 | 0.1725 |
| 0.2457 | 0.8824 | 270 | 0.1565 |
| 0.1658 | 0.9150 | 280 | 0.1936 |
| 0.208 | 0.9477 | 290 | 0.1608 |
| 0.189 | 0.9804 | 300 | 0.1528 |
| 0.1681 | 1.0131 | 310 | 0.1223 |
| 0.1592 | 1.0458 | 320 | 0.1407 |
| 0.1577 | 1.0784 | 330 | 0.1403 |
| 0.1642 | 1.1111 | 340 | 0.1355 |
| 0.1289 | 1.1438 | 350 | 0.1328 |
| 0.1285 | 1.1765 | 360 | 0.1291 |
| 0.1155 | 1.2092 | 370 | 0.1205 |
| 0.0995 | 1.2418 | 380 | 0.1124 |
| 0.1283 | 1.2745 | 390 | 0.1040 |
| 0.107 | 1.3072 | 400 | 0.1126 |
| 0.0981 | 1.3399 | 410 | 0.1128 |
| 0.0881 | 1.3725 | 420 | 0.1017 |
| 0.1188 | 1.4052 | 430 | 0.1054 |
| 0.1063 | 1.4379 | 440 | 0.1044 |
| 0.0812 | 1.4706 | 450 | 0.1032 |
| 0.0894 | 1.5033 | 460 | 0.0978 |
| 0.11 | 1.5359 | 470 | 0.0939 |
| 0.1104 | 1.5686 | 480 | 0.0946 |
| 0.0805 | 1.6013 | 490 | 0.0837 |
| 0.0993 | 1.6340 | 500 | 0.0848 |
| 0.0604 | 1.6667 | 510 | 0.0841 |
| 0.0625 | 1.6993 | 520 | 0.0823 |
| 0.0929 | 1.7320 | 530 | 0.0820 |
| 0.0676 | 1.7647 | 540 | 0.0910 |
| 0.0754 | 1.7974 | 550 | 0.0793 |
| 0.0707 | 1.8301 | 560 | 0.0755 |
| 0.0919 | 1.8627 | 570 | 0.0700 |
| 0.0583 | 1.8954 | 580 | 0.0684 |
| 0.0688 | 1.9281 | 590 | 0.0665 |
| 0.0378 | 1.9608 | 600 | 0.0680 |
| 0.0724 | 1.9935 | 610 | 0.0690 |
| 0.0609 | 2.0261 | 620 | 0.0695 |
| 0.036 | 2.0588 | 630 | 0.0640 |
| 0.0504 | 2.0915 | 640 | 0.0611 |
| 0.0514 | 2.1242 | 650 | 0.0608 |
| 0.0411 | 2.1569 | 660 | 0.0606 |
| 0.0472 | 2.1895 | 670 | 0.0592 |
| 0.0514 | 2.2222 | 680 | 0.0577 |
| 0.0526 | 2.2549 | 690 | 0.0587 |
| 0.0429 | 2.2876 | 700 | 0.0563 |
| 0.0321 | 2.3203 | 710 | 0.0526 |
| 0.0319 | 2.3529 | 720 | 0.0514 |
| 0.037 | 2.3856 | 730 | 0.0519 |
| 0.0296 | 2.4183 | 740 | 0.0516 |
| 0.023 | 2.4510 | 750 | 0.0498 |
| 0.0184 | 2.4837 | 760 | 0.0512 |
| 0.021 | 2.5163 | 770 | 0.0514 |
| 0.0154 | 2.5490 | 780 | 0.0573 |
| 0.0381 | 2.5817 | 790 | 0.0506 |
| 0.0205 | 2.6144 | 800 | 0.0467 |
| 0.0214 | 2.6471 | 810 | 0.0453 |
| 0.0216 | 2.6797 | 820 | 0.0441 |
| 0.024 | 2.7124 | 830 | 0.0438 |
| 0.0317 | 2.7451 | 840 | 0.0439 |
| 0.0181 | 2.7778 | 850 | 0.0430 |
| 0.0227 | 2.8105 | 860 | 0.0424 |
| 0.02 | 2.8431 | 870 | 0.0417 |
| 0.0092 | 2.8758 | 880 | 0.0415 |
| 0.0228 | 2.9085 | 890 | 0.0410 |
| 0.0151 | 2.9412 | 900 | 0.0408 |
| 0.0208 | 2.9739 | 910 | 0.0407 |

### Framework versions

- Transformers 4.48.3
- Pytorch 2.6.0+cu124
- Datasets 3.3.1
- Tokenizers 0.21.0
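
A quick runtime check against these pinned versions (a convenience sketch, not part of the original training setup):

```python
# Verify the local environment matches the versions this model was trained with.
expected = {
    "transformers": "4.48.3",
    "torch": "2.6.0+cu124",
    "datasets": "3.3.1",
    "tokenizers": "0.21.0",
}
for name, want in expected.items():
    got = __import__(name).__version__
    status = "OK" if got == want else f"differs (card lists {want})"
    print(f"{name} {got}: {status}")
```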