# ShweYi-17K-mt5-small

A fine-tuned google/mt5-small model for multilingual translation between English, Myanmar (Burmese), and Japanese, trained on the TALPCo dataset (CC BY 4.0).

## Training Info

- Optimizer: AdamW
- Epochs: 5
- Batch size: 4
- Learning rate: 1e-5
- Max length: 256
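
For reference, here is a minimal fine-tuning sketch that mirrors the hyperparameters above using Hugging Face's `Seq2SeqTrainer`. The dataset loading, column names (`input_text`, `target_text`), and task-prefix preprocessing are assumptions for illustration; the actual training script for this checkpoint is not published.

```python
# Fine-tuning sketch, assuming a preprocessed TALPCo dataset with
# task-prefixed "input_text" and "target_text" columns (hypothetical names).
from transformers import (
    AutoTokenizer,
    MT5ForConditionalGeneration,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

MAX_LENGTH = 256

tokenizer = AutoTokenizer.from_pretrained("google/mt5-small", legacy=True)
model = MT5ForConditionalGeneration.from_pretrained("google/mt5-small")

def preprocess(batch):
    # Tokenize sources and targets, truncating to the training length.
    model_inputs = tokenizer(
        batch["input_text"], max_length=MAX_LENGTH, truncation=True
    )
    labels = tokenizer(
        text_target=batch["target_text"], max_length=MAX_LENGTH, truncation=True
    )
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

# train_dataset = raw_dataset.map(preprocess, batched=True)  # hypothetical dataset

args = Seq2SeqTrainingArguments(
    output_dir="shweyi-17k-mt5-small",
    num_train_epochs=5,                # Epochs: 5
    per_device_train_batch_size=4,     # Batch size: 4
    learning_rate=1e-5,                # Learning rate: 1e-5
    optim="adamw_torch",               # Optimizer: AdamW
)

# trainer = Seq2SeqTrainer(
#     model=model,
#     args=args,
#     train_dataset=train_dataset,
#     data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
# )
# trainer.train()
```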

## How to Use

```python
from transformers import AutoTokenizer, MT5ForConditionalGeneration
import torch

MAX_LENGTH = 256

tokenizer = AutoTokenizer.from_pretrained("flexavior/ShweYi-17K-mt5-small", legacy=True)
model = MT5ForConditionalGeneration.from_pretrained("flexavior/ShweYi-17K-mt5-small")
model.eval()

# Move the model to GPU when one is available.
if torch.cuda.is_available():
    model = model.to("cuda")

print("Multilingual MT5 Translator is ready!")
print("Format: translate myn to jpn: အခု ဘယ်အချိန် ရှိပြီလဲ.")
print("Type 'exit' to quit.\n")

while True:
    user_input = input(">>> ")
    if user_input.strip().lower() == "exit":
        break

    # Tokenize the prompt, padding/truncating to the training length.
    input_ids = tokenizer(
        user_input,
        return_tensors="pt",
        max_length=MAX_LENGTH,
        padding="max_length",
        truncation=True
    ).input_ids

    if torch.cuda.is_available():
        input_ids = input_ids.to("cuda")

    # Beam-search decoding; no gradients are needed at inference time.
    with torch.no_grad():
        output_ids = model.generate(
            input_ids=input_ids,
            max_length=MAX_LENGTH,
            num_beams=4,
            early_stopping=True
        )

    output_text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
    print(f"Translation: {output_text}\n")
```

## Citations

If you use ShweYi-17K-mt5-small in your research, cite:

```bibtex
@misc{flexavior2025shweyi17kmultilingual,
  author = {Flexavior},
  title  = {Flexavior: shweyi-17k-multilingual-en-my-ja},
  year   = {2025},
  url    = {https://huggingface.co/flexavior/ShweYi-17K-mt5-small},
  note   = {Fine-tuned mT5 for English, Myanmar, and Japanese translation. Dataset: TALPCo.}
}
```

And cite the dataset source:

```bibtex
@article{nomoto2018talpco,
  title   = {TUFS Asian Language Parallel Corpus (TALPCo)},
  author  = {Hiroki Nomoto and Kenji Okano and David Moeljadi and Hideo Sawada},
  journal = {Proceedings of the 24th Annual Meeting of the Association for Natural Language Processing},
  pages   = {436--439},
  year    = {2018}
}
```