nllb_complete

This model is a fine-tuned version of facebook/nllb-200-distilled-600M on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.8285
  • BLEU: 17.1412
  • Gen Len: 17.896
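
The snippet below is a minimal inference sketch, assuming the checkpoint is published as Leonel-Maia/nllb_complete (the repository name on this card). Because the training data is undocumented, the language pair is a placeholder: eng_Latn and fra_Latn are example FLORES-200 codes, not necessarily the languages this model was fine-tuned on.

```python
# Minimal inference sketch. The repo id comes from this card; the language
# codes are PLACEHOLDERS, since the fine-tuning language pair is undocumented.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "Leonel-Maia/nllb_complete"
tokenizer = AutoTokenizer.from_pretrained(model_name, src_lang="eng_Latn")  # placeholder source language
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

inputs = tokenizer("Hello, world!", return_tensors="pt")
generated = model.generate(
    **inputs,
    # NLLB selects the target language via the forced BOS token.
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("fra_Latn"),  # placeholder target language
    max_length=64,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```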

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 64
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 5000
  • num_epochs: 24.0
  • mixed_precision_training: Native AMP
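
As a sketch of how these values map onto transformers.Seq2SeqTrainingArguments: the output directory is a placeholder, and predict_with_generate is an assumption (it is what makes the BLEU and Gen Len columns reported below possible).

```python
# Sketch mapping the hyperparameters above onto Seq2SeqTrainingArguments.
# output_dir is a placeholder; predict_with_generate is assumed, since the
# evaluation reports BLEU and generation length.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="nllb_complete",        # placeholder
    learning_rate=3e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    gradient_accumulation_steps=16,    # 4 * 16 = 64 total train batch size
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=5000,
    num_train_epochs=24.0,
    fp16=True,                         # "Native AMP" mixed precision
    predict_with_generate=True,        # assumption: enables BLEU / Gen Len
)
```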

Training results

| Training Loss | Epoch   | Step   | Validation Loss | BLEU    | Gen Len |
|:-------------:|:-------:|:------:|:---------------:|:-------:|:-------:|
| 2.1296        | 1.4834  | 10000  | 2.0709          | 9.9056  | 20.1323 |
| 2.0253        | 2.9668  | 20000  | 1.9697          | 11.7423 | 19.27   |
| 1.8771        | 4.4503  | 30000  | 1.9199          | 13.3983 | 18.9643 |
| 1.7891        | 5.9338  | 40000  | 1.8851          | 14.1016 | 18.3833 |
| 1.7159        | 7.4173  | 50000  | 1.8680          | 14.8584 | 18.2797 |
| 1.6594        | 8.9007  | 60000  | 1.8473          | 15.8809 | 18.3863 |
| 1.6609        | 10.3842 | 70000  | 1.8406          | 15.8588 | 18.159  |
| 1.6358        | 11.8676 | 80000  | 1.8319          | 16.4395 | 18.4773 |
| 1.5623        | 13.3511 | 90000  | 1.8298          | 16.8956 | 18.3217 |
| 1.5534        | 14.8345 | 100000 | 1.8218          | 16.8725 | 18.5327 |
| 1.498         | 16.3180 | 110000 | 1.8286          | 16.6418 | 17.9697 |
| 1.4663        | 17.8014 | 120000 | 1.8252          | 17.2847 | 17.9357 |
| 1.4309        | 19.2849 | 130000 | 1.8299          | 17.027  | 17.7263 |
| 1.4398        | 20.7684 | 140000 | 1.8270          | 17.0189 | 18.1353 |
| 1.4534        | 22.2519 | 150000 | 1.8292          | 17.04   | 17.9637 |
| 1.4441        | 23.7353 | 160000 | 1.8285          | 17.1412 | 17.896  |
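
A hedged sketch of how the BLEU and Gen Len columns are typically produced in a transformers translation fine-tune, using the evaluate library's sacrebleu metric; the exact metric setup for this run is not documented.

```python
# Hedged sketch of a compute_metrics function producing BLEU and Gen Len,
# in the style of the standard transformers translation example; the exact
# setup used for this run is not documented.
import evaluate
import numpy as np

sacrebleu = evaluate.load("sacrebleu")

def compute_metrics(eval_preds, tokenizer):
    preds, labels = eval_preds
    # Replace -100 (the ignored-label marker) so the labels can be decoded.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    result = sacrebleu.compute(
        predictions=decoded_preds,
        references=[[label] for label in decoded_labels],
    )
    # Gen Len: mean number of non-pad tokens in the generated sequences.
    gen_len = np.mean(
        [np.count_nonzero(np.array(p) != tokenizer.pad_token_id) for p in preds]
    )
    return {"bleu": result["score"], "gen_len": gen_len}
```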

Framework versions

  • Transformers 4.50.3
  • Pytorch 2.7.0+cu126
  • Datasets 3.5.0
  • Tokenizers 0.21.1