---
license: mit
tags:
- translation
- pytorch
- encoder-decoder
- transformer
- english-to-hindi
- nmt
library_name: pytorch
language:
- en
- hi
---

# TransformerNMT: English-to-Hindi Experimental Transformer Model

This repository contains a **Transformer encoder-decoder** model implemented from scratch in PyTorch for English-to-Hindi neural machine translation. The model and all training, preprocessing, and inference scripts are custom: they do **not** use the Hugging Face Transformers library, but they follow the original "Attention Is All You Need" architecture.

## Model Details

- **Architecture:** Transformer encoder-decoder (Vaswani et al., 2017)
- **Framework:** PyTorch
- **Languages:** English (source) → Hindi (target)
- **Vocabulary:** 32,000 BPE tokens per language (trained with the `tokenizers` library)
- **Training Data:** parallel English-Hindi corpus (see the repository for data details)
- **Intended Use:** research, experimentation, and educational purposes

## Training

- Trained from scratch using the scripts in this repository.
- Supports distributed and mixed-precision training.
- Checkpoints and tokenizer files are provided in the `models/` and `Data/bi_tokenizers_32k/` directories.

## Intended Uses & Limitations

- **Intended for:** experimentation, research, and demonstration of custom Transformer implementations.
- **Not intended for:** production use or high-stakes applications.
- **Limitations:** may not achieve state-of-the-art translation quality; use with caution for real-world tasks.

## Example Inference

Below is a simple inference script that translates English text to Hindi using the trained model and tokenizer:

```python
import torch

from tokenizer import BilingualTokenizer as Tokenizer
from model import Transformer, TransformerConfig
from translator import TranslationInference

# 1. Load the config and checkpoint
config = TransformerConfig(shared_embeddings=True)
checkpoint = torch.load('models/TNMT_v1_Beta_single.pt', map_location='cpu')

# 2. Build the model, load the trained weights, and switch to eval mode
model = Transformer(config)
model.load_state_dict(checkpoint['model_state_dict'])
model = model.to('cpu')
model.eval()

# 3. Load the BPE tokenizers for both languages
tokenizer = Tokenizer(vocab_size=32000)
tokenizer_loaded = tokenizer.load_tokenizers('bi_tokenizers_32k')

# 4. Create the inference helper
translator = TranslationInference(
    model=model,
    tokenizer=tokenizer_loaded,
    device='cpu',
)

# 5. Translate a sentence
source_text = "This is a test sentence."
translated_text = translator.translate_text(source_text)
print("Translated text:", translated_text)
```

## Citation

If you use this code or model, please cite:

> Vaswani et al., "Attention Is All You Need", NeurIPS 2017.

---

**Author:** QuarkML
**License:** MIT
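For readers studying the architecture mentioned above: the core operation of a Transformer encoder-decoder is scaled dot-product attention. The following dependency-free Python sketch illustrates that operation on small lists of vectors; the function names and list-of-lists representation are illustrative only, not the repository's API:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def scaled_dot_product_attention(Q, K, V):
    """Q, K, V: lists of vectors (lists of floats).

    For each query, compute dot-product scores against all keys,
    scale by sqrt(d_k), softmax, and return the weighted sum of values.
    """
    d_k = len(K[0])
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in K]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out
```

Each output vector is a convex combination of the value vectors, with weights determined by query-key similarity; in the full model this runs per head, with learned projections for Q, K, and V.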
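The 32,000-token BPE vocabularies listed under Model Details are trained with the `tokenizers` library. As a toy illustration of the underlying BPE idea (repeatedly merging the most frequent adjacent symbol pair), here is a plain-Python sketch; it is not the repository's tokenizer code, and the naive substring replace is adequate only for this small example:

```python
from collections import Counter

def most_frequent_pair(words):
    """Count adjacent symbol pairs across all words (word -> frequency)."""
    pairs = Counter()
    for word, freq in words.items():
        symbols = word.split()
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return max(pairs, key=pairs.get) if pairs else None

def merge_pair(words, pair):
    """Replace every occurrence of the pair with its concatenation."""
    a, b = pair
    old, new = f"{a} {b}", f"{a}{b}"
    return {word.replace(old, new): freq for word, freq in words.items()}

def learn_bpe(words, num_merges):
    """Learn a list of BPE merge rules from a word-frequency dictionary."""
    merges = []
    for _ in range(num_merges):
        pair = most_frequent_pair(words)
        if pair is None:
            break
        words = merge_pair(words, pair)
        merges.append(pair)
    return merges, words

# Words are pre-split into characters; frequencies come from a toy corpus.
corpus = {"l o w": 5, "l o w e r": 2, "n e w e s t": 6, "w i d e s t": 3}
merges, merged = learn_bpe(corpus, num_merges=10)
print(merges[:3])  # first merges learned, e.g. ('e', 's'), ('es', 't'), ...
```

A real training run does this over a large corpus until the target vocabulary size (here, 32,000 per language) is reached.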
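The `translate_text` call in the inference example presumably runs autoregressive decoding under the hood. The sketch below shows the generic greedy-decoding loop such a helper typically implements, with a stand-in scoring function in place of a real model; all names and token ids here are illustrative assumptions, not the repository's API:

```python
BOS, EOS = 1, 2  # illustrative special-token ids

def greedy_decode(next_token_scores, max_len=20):
    """Repeatedly append the highest-scoring token until EOS or max_len."""
    tokens = [BOS]
    for _ in range(max_len):
        scores = next_token_scores(tokens)  # one score per vocabulary id
        next_id = max(range(len(scores)), key=scores.__getitem__)
        tokens.append(next_id)
        if next_id == EOS:
            break
    return tokens

# Stand-in "model": emits token 5 three times, then EOS.
def toy_scores(tokens):
    vocab_size = 10
    target = 5 if len(tokens) < 4 else EOS
    return [1.0 if i == target else 0.0 for i in range(vocab_size)]

print(greedy_decode(toy_scores))  # → [1, 5, 5, 5, 2]
```

In the real model, `next_token_scores` would be a forward pass over the encoder output and the tokens generated so far; beam search replaces the single `max` with a set of running hypotheses.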