---
library_name: transformers
license: apache-2.0
base_model: Helsinki-NLP/opus-mt-en-hi
tags:
- generated_from_trainer
model-index:
- name: english-hindi-colloquial-translator
  results: []
---

# english-hindi-colloquial-translator

This model is a fine-tuned version of [Helsinki-NLP/opus-mt-en-hi](https://huggingface.co/Helsinki-NLP/opus-mt-en-hi) on an unspecified colloquial English-Hindi dataset.
It achieves the following results on the evaluation set:
- Loss: 0.0407

## Model description

More information needed.

## Intended uses & limitations

More information needed. A minimal loading-and-inference sketch is provided in the "How to use" section at the end of this card.

## Training and evaluation data

More information needed.

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a hedged `Seq2SeqTrainingArguments` reconstruction is sketched after the framework versions below):
- learning_rate: 0.0005
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: AdamW (8-bit, via bitsandbytes; `adamw_bnb_8bit`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 10
- num_epochs: 3
- mixed_precision_training: Native AMP

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 6.1566 | 0.0327 | 10 | 2.2221 |
| 1.5545 | 0.0654 | 20 | 1.1247 |
| 1.0138 | 0.0980 | 30 | 0.9714 |
| 0.958 | 0.1307 | 40 | 0.8039 |
| 0.7836 | 0.1634 | 50 | 0.6756 |
| 0.6956 | 0.1961 | 60 | 0.6029 |
| 0.6507 | 0.2288 | 70 | 0.5687 |
| 0.5793 | 0.2614 | 80 | 0.4917 |
| 0.5423 | 0.2941 | 90 | 0.4520 |
| 0.5646 | 0.3268 | 100 | 0.4577 |
| 0.4474 | 0.3595 | 110 | 0.3724 |
| 0.4625 | 0.3922 | 120 | 0.3530 |
| 0.4297 | 0.4248 | 130 | 0.3179 |
| 0.3639 | 0.4575 | 140 | 0.2942 |
| 0.3343 | 0.4902 | 150 | 0.2919 |
| 0.3488 | 0.5229 | 160 | 0.2786 |
| 0.3062 | 0.5556 | 170 | 0.2330 |
| 0.3013 | 0.5882 | 180 | 0.2675 |
| 0.2586 | 0.6209 | 190 | 0.2246 |
| 0.2686 | 0.6536 | 200 | 0.2242 |
| 0.2456 | 0.6863 | 210 | 0.2115 |
| 0.2897 | 0.7190 | 220 | 0.2206 |
| 0.2275 | 0.7516 | 230 | 0.1907 |
| 0.2274 | 0.7843 | 240 | 0.1813 |
| 0.2204 | 0.8170 | 250 | 0.1782 |
| 0.2318 | 0.8497 | 260 | 0.1725 |
| 0.2457 | 0.8824 | 270 | 0.1565 |
| 0.1658 | 0.9150 | 280 | 0.1936 |
| 0.208 | 0.9477 | 290 | 0.1608 |
| 0.189 | 0.9804 | 300 | 0.1528 |
| 0.1681 | 1.0131 | 310 | 0.1223 |
| 0.1592 | 1.0458 | 320 | 0.1407 |
| 0.1577 | 1.0784 | 330 | 0.1403 |
| 0.1642 | 1.1111 | 340 | 0.1355 |
| 0.1289 | 1.1438 | 350 | 0.1328 |
| 0.1285 | 1.1765 | 360 | 0.1291 |
| 0.1155 | 1.2092 | 370 | 0.1205 |
| 0.0995 | 1.2418 | 380 | 0.1124 |
| 0.1283 | 1.2745 | 390 | 0.1040 |
| 0.107 | 1.3072 | 400 | 0.1126 |
| 0.0981 | 1.3399 | 410 | 0.1128 |
| 0.0881 | 1.3725 | 420 | 0.1017 |
| 0.1188 | 1.4052 | 430 | 0.1054 |
| 0.1063 | 1.4379 | 440 | 0.1044 |
| 0.0812 | 1.4706 | 450 | 0.1032 |
| 0.0894 | 1.5033 | 460 | 0.0978 |
| 0.11 | 1.5359 | 470 | 0.0939 |
| 0.1104 | 1.5686 | 480 | 0.0946 |
| 0.0805 | 1.6013 | 490 | 0.0837 |
| 0.0993 | 1.6340 | 500 | 0.0848 |
| 0.0604 | 1.6667 | 510 | 0.0841 |
| 0.0625 | 1.6993 | 520 | 0.0823 |
| 0.0929 | 1.7320 | 530 | 0.0820 |
| 0.0676 | 1.7647 | 540 | 0.0910 |
| 0.0754 | 1.7974 | 550 | 0.0793 |
| 0.0707 | 1.8301 | 560 | 0.0755 |
| 0.0919 | 1.8627 | 570 | 0.0700 |
| 0.0583 | 1.8954 | 580 | 0.0684 |
| 0.0688 | 1.9281 | 590 | 0.0665 |
| 0.0378 | 1.9608 | 600 | 0.0680 |
| 0.0724 | 1.9935 | 610 | 0.0690 |
| 0.0609 | 2.0261 | 620 | 0.0695 |
| 0.036 | 2.0588 | 630 | 0.0640 |
| 0.0504 | 2.0915 | 640 | 0.0611 |
| 0.0514 | 2.1242 | 650 | 0.0608 |
| 0.0411 | 2.1569 | 660 | 0.0606 |
| 0.0472 | 2.1895 | 670 | 0.0592 |
| 0.0514 | 2.2222 | 680 | 0.0577 |
| 0.0526 | 2.2549 | 690 | 0.0587 |
| 0.0429 | 2.2876 | 700 | 0.0563 |
| 0.0321 | 2.3203 | 710 | 0.0526 |
| 0.0319 | 2.3529 | 720 | 0.0514 |
| 0.037 | 2.3856 | 730 | 0.0519 |
| 0.0296 | 2.4183 | 740 | 0.0516 |
| 0.023 | 2.4510 | 750 | 0.0498 |
| 0.0184 | 2.4837 | 760 | 0.0512 |
| 0.021 | 2.5163 | 770 | 0.0514 |
| 0.0154 | 2.5490 | 780 | 0.0573 |
| 0.0381 | 2.5817 | 790 | 0.0506 |
| 0.0205 | 2.6144 | 800 | 0.0467 |
| 0.0214 | 2.6471 | 810 | 0.0453 |
| 0.0216 | 2.6797 | 820 | 0.0441 |
| 0.024 | 2.7124 | 830 | 0.0438 |
| 0.0317 | 2.7451 | 840 | 0.0439 |
| 0.0181 | 2.7778 | 850 | 0.0430 |
| 0.0227 | 2.8105 | 860 | 0.0424 |
| 0.02 | 2.8431 | 870 | 0.0417 |
| 0.0092 | 2.8758 | 880 | 0.0415 |
| 0.0228 | 2.9085 | 890 | 0.0410 |
| 0.0151 | 2.9412 | 900 | 0.0408 |
| 0.0208 | 2.9739 | 910 | 0.0407 |

### Framework versions

- Transformers 4.48.3
- PyTorch 2.6.0+cu124
- Datasets 3.3.1
- Tokenizers 0.21.0
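
### Hyperparameter reconstruction

The hyperparameters above map onto `Seq2SeqTrainingArguments` roughly as shown below. This is a hedged reconstruction, not the original training script: `output_dir` and the evaluation/logging cadence (every 10 steps, inferred from the results table) are assumptions.

```python
from transformers import Seq2SeqTrainingArguments

# Reconstruction of the reported configuration; output_dir and the
# eval/logging cadence are assumptions, not stated on the card.
training_args = Seq2SeqTrainingArguments(
    output_dir="english-hindi-colloquial-translator",  # assumed
    learning_rate=5e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_bnb_8bit",      # 8-bit AdamW (requires bitsandbytes); betas/epsilon at defaults
    lr_scheduler_type="linear",
    warmup_steps=10,
    num_train_epochs=3,
    fp16=True,                   # "Native AMP" mixed precision
    eval_strategy="steps",       # validation loss is reported every 10 steps
    eval_steps=10,
    logging_steps=10,
)
```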
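
## How to use

Since the base model is a Marian-family sequence-to-sequence checkpoint, the standard `transformers` translation API should apply. A minimal sketch, assuming a hypothetical Hub repo id (substitute the actual path of this checkpoint):

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Hypothetical repo id; replace with the actual Hub path of this checkpoint.
model_id = "your-username/english-hindi-colloquial-translator"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Translate a colloquial English sentence to Hindi.
inputs = tokenizer("How's it going?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The same checkpoint should also load through `pipeline("translation", model=model_id)`.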