
mT5-Small (OPUS-100 English→Maltese)

This model is a fine-tuned version of google/mt5-small on the MLRS/OPUS-MT-EN-Fixed dataset. It achieves the following results on the test set:

  • Loss: 0.5395
  • BLEU: 0.5175
    • Brevity Penalty: 0.9870
    • Length Ratio: 0.9871
    • Translation Length: 41331
    • Reference Length: 41873
  • chrF: 75.9261
    • Char Order: 6
    • Word Order: 0
    • Beta: 2
  • Gen Len: 51.54
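The reported fields match sacrebleu-style BLEU and chrF. The exact evaluation script is not included in this card, so the following is only a minimal sketch of how such scores can be recomputed with the Hugging Face `evaluate` library; the prediction and reference strings are placeholder examples.

```python
# Illustrative only: corpus-level BLEU and chrF for model outputs against
# single references, computed via sacrebleu through the `evaluate` library.
import evaluate

predictions = ["Il-kelb qiegħed jiġri fil-park."]    # model outputs (placeholder)
references = [["Il-kelb qed jiġri fil-park."]]       # one reference per prediction

bleu = evaluate.load("sacrebleu")
chrf = evaluate.load("chrf")

bleu_result = bleu.compute(predictions=predictions, references=references)
# chrF with the parameters reported above: char order 6, word order 0, beta 2
chrf_result = chrf.compute(predictions=predictions, references=references,
                           char_order=6, word_order=0, beta=2)

print(bleu_result["score"], bleu_result["bp"], bleu_result["sys_len"], bleu_result["ref_len"])
print(chrf_result["score"])
```

Note that sacrebleu reports BLEU on a 0–100 scale, whereas the BLEU values in this card appear to be on a 0–1 scale (0.5175 ≈ 51.75).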

Intended uses & limitations

The model is fine-tuned for a specific task (English→Maltese translation), so it should only be used for the same or a closely related task. Any limitations present in the base model are inherited.
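A minimal usage sketch with the Transformers library is shown below. It assumes the model identifier MLRS/mt5-small_opus100-eng-mlt and plain English input without a task prefix; whether the fine-tuning script expects a prefix is not documented here, and the generation settings are illustrative choices rather than values from this card.

```python
# Illustrative English→Maltese translation with this checkpoint.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "MLRS/mt5-small_opus100-eng-mlt"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "The weather is nice today."
inputs = tokenizer(text, return_tensors="pt")
# max_new_tokens and num_beams are illustrative, not values from the card
outputs = model.generate(**inputs, max_new_tokens=128, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```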

Training procedure

The model was fine-tuned using a customised script.

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 128
  • optimizer: Adafactor (no additional optimizer arguments)
  • lr_scheduler_type: linear
  • num_epochs: 10.0
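Since training used a customised script that is not reproduced here, the sketch below only shows how the hyperparameters above would map onto Seq2SeqTrainingArguments in Transformers; treat it as an assumption about an equivalent configuration, not the actual script. The output directory name is hypothetical.

```python
# Hypothetical mapping of the reported hyperparameters onto
# Transformers Seq2SeqTrainingArguments; the original custom script may differ.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="mt5-small_opus100-eng-mlt",  # illustrative output path
    learning_rate=1e-3,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    gradient_accumulation_steps=4,   # effective train batch size: 32 * 4 = 128
    num_train_epochs=10,
    lr_scheduler_type="linear",
    optim="adafactor",
    seed=42,
    predict_with_generate=True,      # needed to compute BLEU/chrF during evaluation
    eval_strategy="epoch",
)
```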

Training results

| Training Loss | Epoch | Step | Validation Loss | BLEU | BLEU Brevity Penalty | BLEU Length Ratio | BLEU Translation Length | BLEU Reference Length | chrF | Char Order | Word Order | Beta | Gen Len |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.7973 | 1.0 | 7813 | 0.7208 | 0.4470 | 0.9888 | 0.9889 | 43489 | 43979 | 71.3941 | 6 | 0 | 2 | 54.3435 |
| 0.6534 | 2.0 | 15626 | 0.6406 | 0.4712 | 0.9931 | 0.9931 | 43675 | 43979 | 72.7865 | 6 | 0 | 2 | 54.1575 |
| 0.5785 | 3.0 | 23439 | 0.6027 | 0.4804 | 0.9937 | 0.9937 | 43703 | 43979 | 73.6939 | 6 | 0 | 2 | 54.501 |
| 0.5336 | 4.0 | 31252 | 0.5779 | 0.4900 | 0.9937 | 0.9937 | 43704 | 43979 | 74.1543 | 6 | 0 | 2 | 54.535 |
| 0.5034 | 5.0 | 39065 | 0.5617 | 0.4995 | 1.0 | 1.0004 | 43998 | 43979 | 74.7266 | 6 | 0 | 2 | 54.694 |
| 0.4797 | 6.0 | 46878 | 0.5501 | 0.4985 | 0.9897 | 0.9898 | 43530 | 43979 | 74.7707 | 6 | 0 | 2 | 54.3215 |
| 0.4576 | 7.0 | 54691 | 0.5458 | 0.5050 | 0.9921 | 0.9921 | 43632 | 43979 | 75.0066 | 6 | 0 | 2 | 54.259 |
| 0.4424 | 8.0 | 62504 | 0.5369 | 0.5062 | 0.9914 | 0.9915 | 43604 | 43979 | 75.0734 | 6 | 0 | 2 | 54.286 |
| 0.4287 | 9.0 | 70317 | 0.5358 | 0.5107 | 0.9875 | 0.9876 | 43434 | 43979 | 75.3841 | 6 | 0 | 2 | 54.1655 |
| 0.417 | 9.9988 | 78120 | 0.5350 | 0.5100 | 0.9868 | 0.9869 | 43404 | 43979 | 75.4033 | 6 | 0 | 2 | 54.058 |

Framework versions

  • Transformers 4.48.1
  • PyTorch 2.4.1+cu121
  • Datasets 3.2.0
  • Tokenizers 0.21.0

License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. Permissions beyond the scope of this license may be available at https://mlrs.research.um.edu.mt/.

CC BY-NC-SA 4.0

Citation

This work was first presented in MELABenchv1: Benchmarking Large Language Models against Smaller Fine-Tuned Models for Low-Resource Maltese NLP. Cite it as follows:

@inproceedings{micallef-borg-2025-melabenchv1,
    title = "{MELAB}enchv1: Benchmarking Large Language Models against Smaller Fine-Tuned Models for Low-Resource {M}altese {NLP}",
    author = "Micallef, Kurt  and
      Borg, Claudia",
    editor = "Che, Wanxiang  and
      Nabende, Joyce  and
      Shutova, Ekaterina  and
      Pilehvar, Mohammad Taher",
    booktitle = "Findings of the Association for Computational Linguistics: ACL 2025",
    month = jul,
    year = "2025",
    address = "Vienna, Austria",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.findings-acl.1053/",
    doi = "10.18653/v1/2025.findings-acl.1053",
    pages = "20505--20527",
    ISBN = "979-8-89176-256-5",
}