druk-ai-20250628_0745

This model is a fine-tuned version of facebook/nllb-200-distilled-600M (the training dataset is not specified in this card). It achieves the following result on the evaluation set:

  • Loss: 2.2690
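Since the reported loss is a token-level cross-entropy, the corresponding perplexity is simply exp(loss). A quick sanity check (a sketch added here, not part of the original card):

```python
import math

eval_loss = 2.2690                 # final validation loss from the table below
perplexity = math.exp(eval_loss)   # perplexity = exp(cross-entropy loss)
print(round(perplexity, 2))        # → 9.67
```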

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0005
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 3
  • mixed_precision_training: Native AMP
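The effective batch size of 8 follows from train_batch_size × gradient_accumulation_steps, and the `linear` scheduler ramps the learning rate up over the first 100 steps, then decays it linearly to zero. A minimal pure-Python sketch of that schedule; the total step count of 2193 is an assumption inferred from the training log below (roughly 731 optimizer steps per epoch × 3 epochs):

```python
def linear_warmup_lr(step, base_lr=5e-4, warmup_steps=100, total_steps=2193):
    """Linear warmup then linear decay, mirroring lr_scheduler_type=`linear`."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps  # ramp up from 0 to base_lr
    # decay linearly from base_lr at the end of warmup down to 0 at total_steps
    return base_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))

effective_batch = 4 * 2  # train_batch_size * gradient_accumulation_steps = 8
print(effective_batch, linear_warmup_lr(50), linear_warmup_lr(100))
```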

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|---------------|--------|------|-----------------|
| 4.6718        | 0.0684 | 50   | 3.9966          |
| 3.3909        | 0.1367 | 100  | 3.2014          |
| 3.2175        | 0.2051 | 150  | 2.9763          |
| 3.1066        | 0.2734 | 200  | 2.9230          |
| 3.058         | 0.3418 | 250  | 2.8082          |
| 2.9733        | 0.4101 | 300  | 2.7560          |
| 2.9797        | 0.4785 | 350  | 2.7420          |
| 2.714         | 0.5468 | 400  | 2.6686          |
| 2.8964        | 0.6152 | 450  | 2.6501          |
| 2.7973        | 0.6835 | 500  | 2.6197          |
| 2.7552        | 0.7519 | 550  | 2.5710          |
| 2.7453        | 0.8202 | 600  | 2.5410          |
| 2.9687        | 0.8886 | 650  | 2.5268          |
| 2.7995        | 0.9569 | 700  | 2.5237          |
| 2.5497        | 1.0253 | 750  | 2.5099          |
| 2.6585        | 1.0936 | 800  | 2.4769          |
| 2.7442        | 1.1620 | 850  | 2.4660          |
| 2.7224        | 1.2303 | 900  | 2.4511          |
| 2.704         | 1.2987 | 950  | 2.4375          |
| 2.5466        | 1.3671 | 1000 | 2.4223          |
| 2.3552        | 1.4354 | 1050 | 2.4044          |
| 2.6877        | 1.5038 | 1100 | 2.4021          |
| 2.2772        | 1.5721 | 1150 | 2.3974          |
| 2.5707        | 1.6405 | 1200 | 2.3753          |
| 2.5388        | 1.7088 | 1250 | 2.3624          |
| 2.4451        | 1.7772 | 1300 | 2.3741          |
| 2.6623        | 1.8455 | 1350 | 2.3595          |
| 2.2503        | 1.9139 | 1400 | 2.3445          |
| 2.4205        | 1.9822 | 1450 | 2.3315          |
| 2.2562        | 2.0506 | 1500 | 2.3277          |
| 2.2127        | 2.1189 | 1550 | 2.3287          |
| 2.4043        | 2.1873 | 1600 | 2.3091          |
| 2.3461        | 2.2556 | 1650 | 2.3168          |
| 2.5133        | 2.3240 | 1700 | 2.2984          |
| 2.4444        | 2.3923 | 1750 | 2.2961          |
| 2.3056        | 2.4607 | 1800 | 2.2970          |
| 2.4537        | 2.5290 | 1850 | 2.2844          |
| 2.3241        | 2.5974 | 1900 | 2.2835          |
| 2.2608        | 2.6658 | 1950 | 2.2756          |
| 2.3779        | 2.7341 | 2000 | 2.2758          |
| 2.3757        | 2.8025 | 2050 | 2.2691          |
| 2.2582        | 2.8708 | 2100 | 2.2710          |
| 2.3975        | 2.9392 | 2150 | 2.2690          |
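The log also lets one back out the approximate training-set size: step 50 corresponds to epoch 0.0684, which implies about 731 optimizer steps per epoch, and with the effective batch size of 8 that is roughly 5,850 training examples. This is an estimate derived from the log, not a figure stated anywhere in the card:

```python
steps_per_epoch = 50 / 0.0684          # from the first log row: step 50 ≈ epoch 0.0684
approx_examples = steps_per_epoch * 8  # effective batch size = 4 * 2 = 8
print(round(steps_per_epoch), round(approx_examples))  # → 731 5848
```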

Framework versions

  • PEFT 0.13.2
  • Transformers 4.45.0
  • PyTorch 2.6.0+cu124
  • Datasets 2.21.0
  • Tokenizers 0.20.3
Model tree for achisingh06/druk-ai-20250628_0745

  • Base model: facebook/nllb-200-distilled-600M (this model is a PEFT adapter)