---
license: mit
tags:
- translation
- pytorch
- encoder-decoder
- transformer
- english-to-hindi
- nmt
library_name: pytorch
language:
- en
- hi
---

# TransformerNMT: English-to-Hindi Experimental Transformer Model

This repository contains a **Transformer encoder-decoder** model implemented from scratch in PyTorch for English-to-Hindi neural machine translation. The model and all training, preprocessing, and inference scripts are custom: they do **not** use the Hugging Face Transformers library, but they follow the original "Attention Is All You Need" architecture.

## Model Details

- **Architecture:** Transformer encoder-decoder (Vaswani et al., 2017)
- **Framework:** PyTorch
- **Languages:** English (source) → Hindi (target)
- **Vocabulary:** 32,000 BPE tokens per language (trained with the `tokenizers` library)
- **Training Data:** parallel English-Hindi corpus (see the repository for data details)
- **Intended Use:** research, experimentation, and educational purposes

## Training

- Trained from scratch using the scripts in this repository.
- Supports distributed and mixed-precision training.
- Checkpoints and tokenizer files are provided in the `models/` and `Data/bi_tokenizers_32k/` directories.

## Intended Uses & Limitations

- **Intended for:** experimentation, research, and demonstration of custom Transformer implementations.
- **Not intended for:** production use or high-stakes applications.
- **Limitations:** may not achieve state-of-the-art translation quality; use with caution for real-world tasks.

## Example Inference

Below is a simple inference script that translates English text to Hindi using the trained model and tokenizer:

```python
import torch

from tokenizer import BilingualTokenizer as Tokenizer
from model import Transformer, TransformerConfig
from translator import TranslationInference

# 1. Load the config and checkpoint
config = TransformerConfig(shared_embeddings=True)
checkpoint = torch.load('models/TNMT_v1_Beta_single.pt', map_location='cpu')

# 2. Build the model, load the trained weights, and switch to eval mode
model = Transformer(config)
model.load_state_dict(checkpoint['model_state_dict'])
model = model.to('cpu')
model.eval()

# 3. Load the BPE tokenizers for both languages
tokenizer = Tokenizer(vocab_size=32000)
tokenizer_loaded = tokenizer.load_tokenizers('bi_tokenizers_32k')

# 4. Create the inference helper
translator = TranslationInference(
    model=model,
    tokenizer=tokenizer_loaded,
    device='cpu',
)

# 5. Translate a sentence
source_text = "This is a test sentence."
translated_text = translator.translate_text(source_text)
print("Translated text:", translated_text)
```

## Citation

If you use this code or model, please cite:

> Vaswani et al., "Attention Is All You Need", NeurIPS 2017.

---

**Author:** QuarkML
**License:** MIT
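For readers studying the architecture mentioned above: the core operation of a Transformer encoder-decoder is scaled dot-product attention. The following dependency-free Python sketch illustrates that operation on small lists of vectors; the function names and list-of-lists representation are illustrative only, not the repository's API:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def scaled_dot_product_attention(Q, K, V):
    """Q, K, V: lists of vectors (lists of floats).

    For each query, compute dot-product scores against all keys,
    scale by sqrt(d_k), softmax, and return the weighted sum of values.
    """
    d_k = len(K[0])
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in K]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out
```

Each output vector is a convex combination of the value vectors, with weights determined by query-key similarity; in the full model this runs per head, with learned projections for Q, K, and V.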
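The 32,000-token BPE vocabularies listed under Model Details are trained with the `tokenizers` library. As a toy illustration of the underlying BPE idea (repeatedly merging the most frequent adjacent symbol pair), here is a plain-Python sketch; it is not the repository's tokenizer code, and the naive substring replace is adequate only for this small example:

```python
from collections import Counter

def most_frequent_pair(words):
    """Count adjacent symbol pairs across all words (word -> frequency)."""
    pairs = Counter()
    for word, freq in words.items():
        symbols = word.split()
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return max(pairs, key=pairs.get) if pairs else None

def merge_pair(words, pair):
    """Replace every occurrence of the pair with its concatenation."""
    a, b = pair
    old, new = f"{a} {b}", f"{a}{b}"
    return {word.replace(old, new): freq for word, freq in words.items()}

def learn_bpe(words, num_merges):
    """Learn a list of BPE merge rules from a word-frequency dictionary."""
    merges = []
    for _ in range(num_merges):
        pair = most_frequent_pair(words)
        if pair is None:
            break
        words = merge_pair(words, pair)
        merges.append(pair)
    return merges, words

# Words are pre-split into characters; frequencies come from a toy corpus.
corpus = {"l o w": 5, "l o w e r": 2, "n e w e s t": 6, "w i d e s t": 3}
merges, merged = learn_bpe(corpus, num_merges=10)
print(merges[:3])  # first merges learned, e.g. ('e', 's'), ('es', 't'), ...
```

A real training run does this over a large corpus until the target vocabulary size (here, 32,000 per language) is reached.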
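The `translate_text` call in the inference example presumably runs autoregressive decoding under the hood. The sketch below shows the generic greedy-decoding loop such a helper typically implements, with a stand-in scoring function in place of a real model; all names and token ids here are illustrative assumptions, not the repository's API:

```python
BOS, EOS = 1, 2  # illustrative special-token ids

def greedy_decode(next_token_scores, max_len=20):
    """Repeatedly append the highest-scoring token until EOS or max_len."""
    tokens = [BOS]
    for _ in range(max_len):
        scores = next_token_scores(tokens)  # one score per vocabulary id
        next_id = max(range(len(scores)), key=scores.__getitem__)
        tokens.append(next_id)
        if next_id == EOS:
            break
    return tokens

# Stand-in "model": emits token 5 three times, then EOS.
def toy_scores(tokens):
    vocab_size = 10
    target = 5 if len(tokens) < 4 else EOS
    return [1.0 if i == target else 0.0 for i in range(vocab_size)]

print(greedy_decode(toy_scores))  # → [1, 5, 5, 5, 2]
```

In the real model, `next_token_scores` would be a forward pass over the encoder output and the tokens generated so far; beam search replaces the single `max` with a set of running hypotheses.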