---
license: cc-by-4.0
datasets:
- matbahasa/TALPCo
language:
- my
- en
- ja
base_model:
- google/mt5-small
tags:
- translation
- seq2seq
- fine-tuned
- text-to-text
training_info:
  optimizer: AdamW
  epochs: 5
  batch_size: 4
  learning_rate: 1e-5
  max_length: 256
---

# ShweYi-17K-mt5-small

ShweYi-17K-mt5-small is a fine-tuned [`google/mt5-small`](https://huggingface.co/google/mt5-small) model for multilingual translation between **English**, **Myanmar (Burmese)**, and **Japanese**, trained on the [TALPCo dataset](https://github.com/matbahasa/TALPCo) (CC BY 4.0).

## Training Info

- Optimizer: AdamW
- Epochs: 5
- Batch Size: 4
- Learning Rate: 1e-5
- Max Length: 256

## How to Use

```python
from transformers import AutoTokenizer, MT5ForConditionalGeneration
import torch

MAX_LENGTH = 256
MODEL_ID = "flexavior/ShweYi-17K-mt5-small"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, legacy=True)
model = MT5ForConditionalGeneration.from_pretrained(MODEL_ID)

# Run on the GPU when one is available, otherwise on the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)
model.eval()

print("Multilingual MT5 Translator is ready!")
print("Format: translate myn to jpn: အခု ဘယ်အချိန် ရှိပြီလဲ.")
print("Type 'exit' to quit.\n")

while True:
    user_input = input(">>> ")
    if user_input.strip().lower() == "exit":
        break

    # Tokenize the prompt, padding or truncating it to MAX_LENGTH tokens.
    inputs = tokenizer(
        user_input,
        return_tensors="pt",
        max_length=MAX_LENGTH,
        padding="max_length",
        truncation=True,
    ).to(device)

    # Beam search; the attention mask keeps the model from attending to padding.
    with torch.no_grad():
        output_ids = model.generate(
            input_ids=inputs.input_ids,
            attention_mask=inputs.attention_mask,
            max_length=MAX_LENGTH,
            num_beams=4,
            early_stopping=True,
        )

    output_text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
    print(f" Translation: {output_text}\n")
```

# Citations

If you use ShweYi-17K-mt5-small in your research, cite:

```bibtex
@misc{flexavior2025shweyi17kmultilingual,
  author = {Flexavior},
  title  = {Flexavior: shweyi-17k-multilingual-en-my-ja},
  year   = {2025},
  url    = {https://huggingface.co/flexavior/ShweYi-17K-mt5-small},
  note   = {Fine-tuned mT5 for English, Myanmar, and Japanese translation. Dataset: TALPCo.}
}
```

Please also cite the dataset source:

```bibtex
@article{published_papers/22434604,
  title   = {TUFS Asian Language Parallel Corpus (TALPCo)},
  author  = {Hiroki Nomoto and Kenji Okano and David Moeljadi and Hideo Sawada},
  journal = {言語処理学会 第24回年次大会 発表論文集},
  pages   = {436--439},
  year    = {2018}
}
```
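
For scripted rather than interactive use, the prompt format from the usage example (`translate myn to jpn: ...`) can be built with a small helper. The following is a minimal sketch, not part of the model repository: `build_prompt` and `LANG_CODES` are hypothetical names, and only the codes `myn` and `jpn` appear in this card. The code `eng` for English is an assumption that should be checked against the training data before relying on it.

```python
# Hypothetical helper; not shipped with the model.
# "myn" and "jpn" come from the example prompt in this card.
# "eng" is an ASSUMED code for English and may not match the training data.
LANG_CODES = {"myn", "jpn", "eng"}


def build_prompt(src: str, tgt: str, text: str) -> str:
    """Return a prompt in the 'translate <src> to <tgt>: <text>' format."""
    if src not in LANG_CODES or tgt not in LANG_CODES:
        raise ValueError(f"unknown language code: {src!r} or {tgt!r}")
    if src == tgt:
        raise ValueError("source and target languages must differ")
    text = text.strip()
    if not text:
        raise ValueError("text must not be empty")
    return f"translate {src} to {tgt}: {text}"


if __name__ == "__main__":
    print(build_prompt("myn", "jpn", "အခု ဘယ်အချိန် ရှိပြီလဲ."))
```

The returned string can be passed to the tokenizer in place of `user_input`. Validating the codes up front is a design choice: it makes a mistyped language code fail immediately, before it reaches the model.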