MarianMT Indonesian-English Translation (Fine-tuned)
This model is a fine-tuned version of Helsinki-NLP/opus-mt-id-en for Indonesian to English translation.
Model Details
- Base Model: Helsinki-NLP/opus-mt-id-en
- Fine-tuned on: TED Talks parallel corpus (Indonesian-English)
- Training Date: 2025-05-25
- Languages: Indonesian (id) → English (en)
- License: Apache 2.0
Training Configuration
- Training Framework: PyTorch + Transformers
- Training Data: TED Talks parallel corpus
- Dataset Usage: 100% of full dataset
- Training Parameters:
  - Learning Rate: 3e-5
  - Batch Size: 4/2 (GPU/CPU)
  - Max Length: 128 tokens
  - Epochs: 10
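The training script itself is not part of this card, so the following is only a minimal sketch of how the hyperparameters listed above could be collected for a Transformers training run. The helper name `build_training_kwargs` and the `has_gpu` flag are hypothetical; the dictionary keys follow the argument names of `Seq2SeqTrainingArguments`, and the batch size rule assumes "4/2 (GPU/CPU)" means 4 on GPU and 2 on CPU.

```python
# Token limit from the card; assumed to be applied when tokenizing the data.
MAX_LENGTH = 128


def build_training_kwargs(has_gpu: bool) -> dict:
    """Collect the card's hyperparameters as keyword arguments (hypothetical helper)."""
    return {
        "learning_rate": 3e-5,
        # Assumption: batch size 4 on GPU, 2 on CPU, per the card's "4/2 (GPU/CPU)".
        "per_device_train_batch_size": 4 if has_gpu else 2,
        "num_train_epochs": 10,
    }


if __name__ == "__main__":
    print(build_training_kwargs(has_gpu=False))
```

The resulting dictionary could then be unpacked into `Seq2SeqTrainingArguments` together with an output directory of your choice.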
Usage
from transformers import MarianMTModel, MarianTokenizer

# Load model and tokenizer
tokenizer = MarianTokenizer.from_pretrained("dhintech/marian-id-en-lg")
model = MarianMTModel.from_pretrained("dhintech/marian-id-en-lg")

# Translate Indonesian to English
def translate(text):
    inputs = tokenizer(text, return_tensors="pt", padding=True)
    outputs = model.generate(**inputs, max_length=128, num_beams=4)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Example usage
indonesian_text = "Selamat pagi, terima kasih sudah datang."
english_translation = translate(indonesian_text)
print(english_translation)
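The `translate` function above decodes only the first output, so it handles one sentence per call. For several sentences at once, a batched variant might look like the sketch below. The function name `translate_batch` and the `batch_size` default are assumptions, not part of the original card; the tokenizer and model are passed in as arguments and are expected to be the objects loaded above. Truncation to 128 tokens is added here to match the training max length.

```python
from typing import List


def translate_batch(texts: List[str], tokenizer, model,
                    max_length: int = 128, num_beams: int = 4,
                    batch_size: int = 8) -> List[str]:
    """Translate a list of Indonesian sentences in chunks of `batch_size`."""
    translations: List[str] = []
    for start in range(0, len(texts), batch_size):
        chunk = texts[start:start + batch_size]
        # Pad to the longest sentence in the chunk and truncate long inputs.
        inputs = tokenizer(chunk, return_tensors="pt", padding=True,
                           truncation=True, max_length=max_length)
        outputs = model.generate(**inputs, max_length=max_length,
                                 num_beams=num_beams)
        # batch_decode returns one string per generated sequence.
        translations.extend(
            tokenizer.batch_decode(outputs, skip_special_tokens=True)
        )
    return translations
```

Calling `translate_batch(["Selamat pagi.", "Terima kasih."], tokenizer, model)` should return one English string per input, in the same order.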
Example Translations
| Indonesian | English |
|---|---|
| Selamat pagi, terima kasih sudah datang. | Good morning, thank you for coming. |
| Teknologi AI berkembang sangat pesat. | AI technology is developing very rapidly. |
| Mari kita diskusikan hasil penelitian ini. | Let's discuss the results of this research. |
Performance
- Optimized for conversational and presentation-style text
- Best performance on formal Indonesian text
- Model size: approximately 300MB
- Suitable for mobile deployment
Citation
@misc{marian-id-en-lg,
  title={MarianMT Indonesian-English Translation (Fine-tuned)},
  author={DhinTech},
  year={2025},
  publisher={Hugging Face},
  journal={Hugging Face Model Hub},
  howpublished={\url{https://huggingface.co/dhintech/marian-id-en-lg}}
}