---
library_name: transformers
license: apache-2.0
base_model: Helsinki-NLP/opus-mt-en-hi
tags:
  - generated_from_trainer
model-index:
  - name: english-hindi-colloquial-translator
    results: []
---

# english-hindi-colloquial-translator

This model is a fine-tuned version of [Helsinki-NLP/opus-mt-en-hi](https://huggingface.co/Helsinki-NLP/opus-mt-en-hi) on an unspecified dataset.
It achieves the following results on the evaluation set:

- Loss: 0.0407
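
Since usage details are not documented below, here is a minimal inference sketch using the standard Transformers seq2seq API. The repo id is a placeholder assumption; substitute the path where this checkpoint is actually published.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Placeholder repo id (assumption) -- replace with the actual checkpoint path.
model_id = "english-hindi-colloquial-translator"

# Marian-based checkpoints need the sentencepiece package for tokenization.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("How are you doing these days?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```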

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (see the sketch after this list):

- learning_rate: 0.0005
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: 8-bit AdamW via bitsandbytes (`OptimizerNames.ADAMW_BNB`) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 10
- num_epochs: 3
- mixed_precision_training: Native AMP
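
For reproducibility, the list above maps onto `Seq2SeqTrainingArguments` roughly as follows. This is a hedged sketch, not the author's script: `output_dir` and the evaluation cadence are assumptions (the results table suggests evaluation every 10 steps).

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of training arguments matching the listed hyperparameters.
training_args = Seq2SeqTrainingArguments(
    output_dir="english-hindi-colloquial-translator",  # assumed
    learning_rate=5e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_bnb_8bit",   # OptimizerNames.ADAMW_BNB; requires bitsandbytes
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=10,
    num_train_epochs=3,
    fp16=True,                # "Native AMP" mixed-precision training
    eval_strategy="steps",    # assumed from the eval log cadence
    eval_steps=10,            # assumed: validation loss logged every 10 steps
)
```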

### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 6.1566 | 0.0327 | 10 | 2.2221 |
| 1.5545 | 0.0654 | 20 | 1.1247 |
| 1.0138 | 0.0980 | 30 | 0.9714 |
| 0.958 | 0.1307 | 40 | 0.8039 |
| 0.7836 | 0.1634 | 50 | 0.6756 |
| 0.6956 | 0.1961 | 60 | 0.6029 |
| 0.6507 | 0.2288 | 70 | 0.5687 |
| 0.5793 | 0.2614 | 80 | 0.4917 |
| 0.5423 | 0.2941 | 90 | 0.4520 |
| 0.5646 | 0.3268 | 100 | 0.4577 |
| 0.4474 | 0.3595 | 110 | 0.3724 |
| 0.4625 | 0.3922 | 120 | 0.3530 |
| 0.4297 | 0.4248 | 130 | 0.3179 |
| 0.3639 | 0.4575 | 140 | 0.2942 |
| 0.3343 | 0.4902 | 150 | 0.2919 |
| 0.3488 | 0.5229 | 160 | 0.2786 |
| 0.3062 | 0.5556 | 170 | 0.2330 |
| 0.3013 | 0.5882 | 180 | 0.2675 |
| 0.2586 | 0.6209 | 190 | 0.2246 |
| 0.2686 | 0.6536 | 200 | 0.2242 |
| 0.2456 | 0.6863 | 210 | 0.2115 |
| 0.2897 | 0.7190 | 220 | 0.2206 |
| 0.2275 | 0.7516 | 230 | 0.1907 |
| 0.2274 | 0.7843 | 240 | 0.1813 |
| 0.2204 | 0.8170 | 250 | 0.1782 |
| 0.2318 | 0.8497 | 260 | 0.1725 |
| 0.2457 | 0.8824 | 270 | 0.1565 |
| 0.1658 | 0.9150 | 280 | 0.1936 |
| 0.208 | 0.9477 | 290 | 0.1608 |
| 0.189 | 0.9804 | 300 | 0.1528 |
| 0.1681 | 1.0131 | 310 | 0.1223 |
| 0.1592 | 1.0458 | 320 | 0.1407 |
| 0.1577 | 1.0784 | 330 | 0.1403 |
| 0.1642 | 1.1111 | 340 | 0.1355 |
| 0.1289 | 1.1438 | 350 | 0.1328 |
| 0.1285 | 1.1765 | 360 | 0.1291 |
| 0.1155 | 1.2092 | 370 | 0.1205 |
| 0.0995 | 1.2418 | 380 | 0.1124 |
| 0.1283 | 1.2745 | 390 | 0.1040 |
| 0.107 | 1.3072 | 400 | 0.1126 |
| 0.0981 | 1.3399 | 410 | 0.1128 |
| 0.0881 | 1.3725 | 420 | 0.1017 |
| 0.1188 | 1.4052 | 430 | 0.1054 |
| 0.1063 | 1.4379 | 440 | 0.1044 |
| 0.0812 | 1.4706 | 450 | 0.1032 |
| 0.0894 | 1.5033 | 460 | 0.0978 |
| 0.11 | 1.5359 | 470 | 0.0939 |
| 0.1104 | 1.5686 | 480 | 0.0946 |
| 0.0805 | 1.6013 | 490 | 0.0837 |
| 0.0993 | 1.6340 | 500 | 0.0848 |
| 0.0604 | 1.6667 | 510 | 0.0841 |
| 0.0625 | 1.6993 | 520 | 0.0823 |
| 0.0929 | 1.7320 | 530 | 0.0820 |
| 0.0676 | 1.7647 | 540 | 0.0910 |
| 0.0754 | 1.7974 | 550 | 0.0793 |
| 0.0707 | 1.8301 | 560 | 0.0755 |
| 0.0919 | 1.8627 | 570 | 0.0700 |
| 0.0583 | 1.8954 | 580 | 0.0684 |
| 0.0688 | 1.9281 | 590 | 0.0665 |
| 0.0378 | 1.9608 | 600 | 0.0680 |
| 0.0724 | 1.9935 | 610 | 0.0690 |
| 0.0609 | 2.0261 | 620 | 0.0695 |
| 0.036 | 2.0588 | 630 | 0.0640 |
| 0.0504 | 2.0915 | 640 | 0.0611 |
| 0.0514 | 2.1242 | 650 | 0.0608 |
| 0.0411 | 2.1569 | 660 | 0.0606 |
| 0.0472 | 2.1895 | 670 | 0.0592 |
| 0.0514 | 2.2222 | 680 | 0.0577 |
| 0.0526 | 2.2549 | 690 | 0.0587 |
| 0.0429 | 2.2876 | 700 | 0.0563 |
| 0.0321 | 2.3203 | 710 | 0.0526 |
| 0.0319 | 2.3529 | 720 | 0.0514 |
| 0.037 | 2.3856 | 730 | 0.0519 |
| 0.0296 | 2.4183 | 740 | 0.0516 |
| 0.023 | 2.4510 | 750 | 0.0498 |
| 0.0184 | 2.4837 | 760 | 0.0512 |
| 0.021 | 2.5163 | 770 | 0.0514 |
| 0.0154 | 2.5490 | 780 | 0.0573 |
| 0.0381 | 2.5817 | 790 | 0.0506 |
| 0.0205 | 2.6144 | 800 | 0.0467 |
| 0.0214 | 2.6471 | 810 | 0.0453 |
| 0.0216 | 2.6797 | 820 | 0.0441 |
| 0.024 | 2.7124 | 830 | 0.0438 |
| 0.0317 | 2.7451 | 840 | 0.0439 |
| 0.0181 | 2.7778 | 850 | 0.0430 |
| 0.0227 | 2.8105 | 860 | 0.0424 |
| 0.02 | 2.8431 | 870 | 0.0417 |
| 0.0092 | 2.8758 | 880 | 0.0415 |
| 0.0228 | 2.9085 | 890 | 0.0410 |
| 0.0151 | 2.9412 | 900 | 0.0408 |
| 0.0208 | 2.9739 | 910 | 0.0407 |

### Framework versions

- Transformers 4.48.3
- Pytorch 2.6.0+cu124
- Datasets 3.3.1
- Tokenizers 0.21.0
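
A quick runtime check against these pinned versions (a convenience sketch, not part of the original training setup):

```python
# Verify the local environment matches the versions this model was trained with.
expected = {
    "transformers": "4.48.3",
    "torch": "2.6.0+cu124",
    "datasets": "3.3.1",
    "tokenizers": "0.21.0",
}
for name, want in expected.items():
    got = __import__(name).__version__
    status = "OK" if got == want else f"differs (card lists {want})"
    print(f"{name} {got}: {status}")
```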