English → Bangla Neural Machine Translation (NMT) - en-bn-nmt

This model is a fine-tuned version of shhossain/opus-mt-en-to-bn.
It has been trained on a combination of open and custom datasets to enhance English to Bengali (Bangla) translation accuracy, especially in conversational, motivational, and real-life sentence structures.


🧠 Model Description

The en-bn-nmt model is a neural machine translation model fine-tuned using the MarianMT architecture from Hugging Face.
The model's main goal is to translate from English to Bangla in an expressive and human-like way, preserving emotional context and natural tone.


📦 Datasets Used

Training was done on a diverse set of open and custom datasets.

This variety of sources was chosen so that the model performs well in formal, informal, and spoken-language use cases.


🎯 Intended Uses & Limitations

Intended for:

  • Language learners and educators
  • Students and researchers
  • App developers building English to Bengali tools

Not suitable for:

  • Legal, medical, or sensitive document translation
  • High-stakes production use without human review

⚙️ Training Details

  • Platform: Google Colab
  • GPU: T4 (via Google Colab Pro)
  • Disk Usage: ~112 GB

📈 Training Results

Training Loss   Epoch   Validation Loss   BLEU Score
0.2006          1       0.191746          24.8904
0.1334          2       0.164062          29.9686
0.0970          3       0.154388          33.1481
  • Fine‑Tuned Model BLEU: 33.14806

📊 Evaluation Metrics

  • Final BLEU Score: 33.14806
  • Final Validation Loss: 0.15439
  • Final Training Loss: 0.09700
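
As a quick check on these figures, the per-epoch BLEU gains can be derived directly from the reported results. The sketch below copies the three rows from the Training Results table; the `results` list and the `bleu_gains` helper are illustrative names, not part of the training code.

```python
# (epoch, training loss, validation loss, BLEU) copied from the Training Results table
results = [
    (1, 0.2006, 0.191746, 24.8904),
    (2, 0.1334, 0.164062, 29.9686),
    (3, 0.0970, 0.154388, 33.1481),
]

def bleu_gains(rows):
    """Return the BLEU change between each pair of consecutive epochs."""
    return [round(later[3] - earlier[3], 4) for earlier, later in zip(rows, rows[1:])]

gains = bleu_gains(results)                              # epoch 1->2, epoch 2->3
total_gain = round(results[-1][3] - results[0][3], 4)    # epoch 1->3
print("Per-epoch BLEU gains:", gains)
print("Total BLEU gain:", total_gain)
```

The gain shrinks from roughly 5.08 points (epoch 1 to 2) to roughly 3.18 points (epoch 2 to 3), for a total of about 8.26 BLEU points over the run.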

🧪 Sample Output Evaluation

We evaluated the model using hand-crafted sentence sets based on motivational, real-life, and spoken content. While BLEU scores are useful, we also performed manual assessments to compare fluency and correctness against reference translations.

✅ Sample Input → Output Example

English:

"Don't wait for opportunity. Create it."

Model Output:

"সুযোগের জন্য অপেক্ষা করো না, এটা তৈরি কর"

Reference:

"সুযোগের অপেক্ষা করো না। নিজেকে তৈরি করো।"

(The model retains the intent but deviates slightly in tone. This is still being refined.)
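
For readers who want a rough, automatic counterpart to the manual comparison, the sketch below counts how many words of the model output also appear in the reference. This is a simple word-overlap measure chosen for illustration; it is not BLEU and not necessarily how the manual assessment was done. The helper names (`tokens`, `clipped_overlap`) and the punctuation list are assumptions.

```python
from collections import Counter

# Sample pair from this card
model_output = "সুযোগের জন্য অপেক্ষা করো না, এটা তৈরি কর"
reference = "সুযোগের অপেক্ষা করো না। নিজেকে তৈরি করো।"

# Basic punctuation to strip before comparing, including the Bangla danda
PUNCT = str.maketrans("", "", ",.।!?\"'")

def tokens(sentence):
    """Split on whitespace after removing basic punctuation."""
    return sentence.translate(PUNCT).split()

def clipped_overlap(hypothesis, ref):
    """Count hypothesis words that also occur in the reference,
    clipping each word by its count in the reference."""
    hyp_counts = Counter(tokens(hypothesis))
    ref_counts = Counter(tokens(ref))
    return sum((hyp_counts & ref_counts).values())

overlap = clipped_overlap(model_output, reference)
precision = overlap / len(tokens(model_output))  # share of output words found in reference
recall = overlap / len(tokens(reference))        # share of reference words covered by output
print(f"overlap={overlap} precision={precision:.3f} recall={recall:.3f}")
```

Word overlap says nothing about tone or fluency, which is why a human check of samples like this one remains useful.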


📚 Usage Example

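from transformers import MarianMTModel, MarianTokenizer

# Load the fine-tuned tokenizer and model from the Hugging Face Hub
model_name = "monirbishal/en-bn-nmt"
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)

def translate(text):
    # Tokenize the English input; truncate anything beyond the model's maximum length
    inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True)
    # Generate Bangla token IDs, then decode them back to text
    translated = model.generate(**inputs)
    return tokenizer.decode(translated[0], skip_special_tokens=True)

print(translate("I have to go to sleep."))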
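
When translating many sentences, it is usually more efficient to send them through the model in batches. The following is a minimal sketch under that assumption; `load_model`, `translate_batch`, and the default `batch_size` and `max_new_tokens` values are illustrative choices, not settings documented for this model.

```python
def load_model(model_name="monirbishal/en-bn-nmt"):
    """Load tokenizer and model; transformers is imported lazily, on first use."""
    from transformers import MarianMTModel, MarianTokenizer
    tokenizer = MarianTokenizer.from_pretrained(model_name)
    model = MarianMTModel.from_pretrained(model_name)
    return tokenizer, model

def translate_batch(texts, tokenizer, model, batch_size=8, max_new_tokens=128):
    """Translate a list of English sentences, batch_size sentences at a time."""
    outputs = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        # Pad to the longest sentence in the batch and truncate overlong inputs
        inputs = tokenizer(batch, return_tensors="pt", padding=True, truncation=True)
        generated = model.generate(**inputs, max_new_tokens=max_new_tokens)
        # batch_decode turns every generated sequence back into text
        outputs.extend(tokenizer.batch_decode(generated, skip_special_tokens=True))
    return outputs
```

To use it, call `tokenizer, model = load_model()` once and then `translate_batch(["I have to go to sleep.", "Don't wait for opportunity."], tokenizer, model)`. The results come back in the same order as the inputs.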

🔬 Future Improvements

  • Improve idiomatic and conversational fluency
  • Add reverse translation support (Bn → En)
  • Include more complex sentence structures (narratives, dialogues, questions)

📘 License

This model is currently open for educational and research use.
We are working on assigning an appropriate license (likely Apache 2.0 or MIT).
If you use this model, please cite the original authors and datasets.


🙌 Acknowledgements


✅ Open Source Notice

This project is released to the open-source community to promote better Bangla language technology for educational and real-world applications.
We welcome feedback, collaboration, and contributions.


📖 Citation

If you use this model, please cite it:

@misc{monirbishal_en_bn_nmt,
  title        = {English to Bengali Neural Machine Translation Model},
  author       = {Monir Bishal},
  howpublished = {\url{https://huggingface.co/monirbishal/en-bn-nmt}},
  year         = {2025}
}