Italian to Ladin Real-Time Translation Model

This is a fast, lightweight real-time translation model from Italian (it) to Ladin (lld), based on Helsinki-NLP/opus-mt-itc-itc and optimized using CTranslate2 for efficient inference.

💡 Key Features

  • Base model: Helsinki-NLP/opus-mt-itc-itc
  • Optimized with CTranslate2
  • 🧠 int8 quantization for faster inference and lower memory usage
  • 🗣️ Designed for real-time transcription + translation use cases (e.g., TransLoco)
  • 🕒 Suitable for low-latency environments like live subtitling or in-browser translation tools
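For the real-time use cases above, one practical detail is grouping incoming transcript segments into small batches instead of issuing one translation call per segment. The sketch below illustrates this with hypothetical helper names (not part of this repository); a real `translate_fn` would wrap the `translate_batch` call shown later in this card:

```python
from typing import Callable, List

def micro_batch(segments: List[str],
                translate_fn: Callable[[List[str]], List[str]],
                max_batch: int = 8) -> List[str]:
    """Translate segments in small batches to balance latency and throughput."""
    out: List[str] = []
    for i in range(0, len(segments), max_batch):
        out.extend(translate_fn(segments[i:i + max_batch]))
    return out

# Stub translator for demonstration only (a real one would call the CTranslate2 model)
fake_translate = lambda batch: [s.upper() for s in batch]
print(micro_batch(["ciao", "questo è un esempio", "grazie"], fake_translate, max_batch=2))
# → ['CIAO', 'QUESTO È UN ESEMPIO', 'GRAZIE']
```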

🏗️ Model Architecture

  • Architecture: Transformer
  • Format: CTranslate2
  • Quantization: int8
  • Size on disk: ~70 MB
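The CTranslate2 artifact was presumably produced with the standard converter that ships with the `ctranslate2` package; a conversion along these lines would yield an int8-quantized model (output directory name and exact options are assumptions, shown for illustration):

```shell
ct2-transformers-converter \
  --model Helsinki-NLP/opus-mt-itc-itc \
  --output_dir transloco-ita-lld \
  --quantization int8
```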

🚀 Intended Use

  • Real-time speech-to-speech or speech-to-text translation from Italian to Ladin
  • Assistive tools for minority language accessibility
  • Educational and research applications
  • Use as part of tools like TransLoco

Non-commercial use only, in accordance with the CC BY-NC 4.0 license.

💻 Example Usage

```python
import ctranslate2
from transformers import AutoTokenizer

# Load the CTranslate2 model and the matching tokenizer
mtmodel = ctranslate2.Translator("./transloco-ita-lld", device="cpu")
tokenizer = AutoTokenizer.from_pretrained("./transloco-ita-lld")

texts = ["Questo è un esempio."]

# CTranslate2 expects token strings, not token ids
tokenized_sentences = [tokenizer.convert_ids_to_tokens(tokenizer.encode(x)) for x in texts]

batch_res = mtmodel.translate_batch(source=tokenized_sentences)

# Convert the best hypothesis of each result back to text
decoded_results = [
    tokenizer.decode(
        tokenizer.convert_tokens_to_ids(res.hypotheses[0]),
        skip_special_tokens=True
    )
    for res in batch_res
]

print(decoded_results)
```
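The tokenize → translate → detokenize flow above can be exercised without the model weights by substituting a stub for the translator, which is handy when wiring the pipeline into a larger application. This is a minimal sketch with illustrative names, not part of this repository:

```python
from types import SimpleNamespace

def translate_pipeline(texts, encode, translator, decode):
    """Mirror the model-card flow: encode to tokens, translate, decode the best hypothesis."""
    tokenized = [encode(t) for t in texts]
    results = translator(tokenized)
    return [decode(r.hypotheses[0]) for r in results]

# Toy whitespace "tokenizer" and an identity "translator" stub
encode = lambda s: s.split()
decode = lambda toks: " ".join(toks)
stub_translator = lambda batch: [SimpleNamespace(hypotheses=[toks]) for toks in batch]

print(translate_pipeline(["Questo è un esempio."], encode, stub_translator, decode))
# → ['Questo è un esempio.']
```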

⚠️ Note: The tokenizer uses fur_Latn as the target language code due to the lack of lld_Latn support in the original NLLB vocabulary.

❗Limitations

  • Ladin is a low-resource language, and the model may struggle with:
    • Out-of-domain vocabulary
    • Differences between the Ladin valley variants
  • The model may hallucinate outputs when given incomplete or noisy input.

⚖️ Ethical Considerations

  • Language technologies for minority languages should be developed with community involvement.
  • Please avoid using the model for commercial applications or mass-translation pipelines without review.

📎 Citation

If you use this model in your work, please cite:

```bibtex
@misc{hallerseeber:frontull:2025,
  title     = {TransLoco: AI-driven real-time transcription, translation, and summarisation},
  subtitle  = {A self-hosted free-software conference tool},
  author    = {Simon Haller-Seeber and Samuel Frontull},
  year      = {2025},
  note      = {In preparation},
}
```