Italian to Ladin Real-Time Translation Model

This is a fast, lightweight real-time translation model from Italian (it) to Ladin (lld), based on Helsinki-NLP/opus-mt-itc-itc and optimized using CTranslate2 for efficient inference.

💡 Key Features

  • Base model: Helsinki-NLP/opus-mt-itc-itc
  • Optimized with CTranslate2
  • 🧠 int8 quantization for faster inference and lower memory usage
  • 🗣️ Designed for real-time transcription + translation use cases (e.g., TransLoco)
  • 🕒 Suitable for low-latency environments like live subtitling or in-browser translation tools
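For the real-time use cases above, one practical detail is grouping incoming transcript segments into small batches instead of issuing one translation call per segment. The sketch below illustrates this with hypothetical helper names (not part of this repository); a real `translate_fn` would wrap the `translate_batch` call shown later in this card:

```python
from typing import Callable, List

def micro_batch(segments: List[str],
                translate_fn: Callable[[List[str]], List[str]],
                max_batch: int = 8) -> List[str]:
    """Translate segments in small batches to balance latency and throughput."""
    out: List[str] = []
    for i in range(0, len(segments), max_batch):
        out.extend(translate_fn(segments[i:i + max_batch]))
    return out

# Stub translator for demonstration only (a real one would call the CTranslate2 model)
fake_translate = lambda batch: [s.upper() for s in batch]
print(micro_batch(["ciao", "questo è un esempio", "grazie"], fake_translate, max_batch=2))
# → ['CIAO', 'QUESTO È UN ESEMPIO', 'GRAZIE']
```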

🏗️ Model Architecture

  • Architecture: Transformer
  • Format: CTranslate2
  • Quantization: int8
  • Size on disk: ~70 MB
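The CTranslate2 artifact was presumably produced with the standard converter that ships with the `ctranslate2` package; a conversion along these lines would yield an int8-quantized model (output directory name and exact options are assumptions, shown for illustration):

```shell
ct2-transformers-converter \
  --model Helsinki-NLP/opus-mt-itc-itc \
  --output_dir transloco-ita-lld \
  --quantization int8
```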

🚀 Intended Use

  • Real-time speech-to-speech or speech-to-text translation from Italian to Ladin
  • Assistive tools for minority language accessibility
  • Educational and research applications
  • Use as part of tools like TransLoco

Non-commercial use only, in accordance with the CC BY-NC 4.0 license.

💻 Example Usage

```python
import ctranslate2
from transformers import AutoTokenizer

# Load the CTranslate2 model and the matching tokenizer
mtmodel = ctranslate2.Translator("./transloco-ita-lld", device="cpu")
tokenizer = AutoTokenizer.from_pretrained("./transloco-ita-lld")

texts = ["Questo è un esempio."]

# CTranslate2 expects token strings, not token ids
tokenized_sentences = [tokenizer.convert_ids_to_tokens(tokenizer.encode(x)) for x in texts]

batch_res = mtmodel.translate_batch(source=tokenized_sentences)

# Convert the best hypothesis of each result back to text
decoded_results = [
    tokenizer.decode(
        tokenizer.convert_tokens_to_ids(res.hypotheses[0]),
        skip_special_tokens=True
    )
    for res in batch_res
]

print(decoded_results)
```
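The tokenize → translate → detokenize flow above can be exercised without the model weights by substituting a stub for the translator, which is handy when wiring the pipeline into a larger application. This is a minimal sketch with illustrative names, not part of this repository:

```python
from types import SimpleNamespace

def translate_pipeline(texts, encode, translator, decode):
    """Mirror the model-card flow: encode to tokens, translate, decode the best hypothesis."""
    tokenized = [encode(t) for t in texts]
    results = translator(tokenized)
    return [decode(r.hypotheses[0]) for r in results]

# Toy whitespace "tokenizer" and an identity "translator" stub
encode = lambda s: s.split()
decode = lambda toks: " ".join(toks)
stub_translator = lambda batch: [SimpleNamespace(hypotheses=[toks]) for toks in batch]

print(translate_pipeline(["Questo è un esempio."], encode, stub_translator, decode))
# → ['Questo è un esempio.']
```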

⚠️ Note: The tokenizer uses fur_Latn as the target language code due to the lack of lld_Latn support in the original NLLB vocabulary.

❗Limitations

  • Ladin is a low-resource language, and the model may struggle with:
    • Out-of-domain vocabulary
    • Differences between the Ladin valley variants
  • The model may hallucinate outputs when given incomplete or noisy input.

⚖️ Ethical Considerations

  • Language technologies for minority languages should be developed with community involvement.
  • Please avoid using the model for commercial applications or mass-translation pipelines without review.

📎 Citation

If you use this model in your work, please cite:

```bibtex
@misc{hallerseeber:frontull:2025,
  title     = {TransLoco: AI-driven real-time transcription, translation, and summarisation},
  subtitle  = {A self-hosted free-software conference tool},
  author    = {Simon Haller-Seeber and Samuel Frontull},
  year      = {2025},
  note      = {In preparation},
}
```