# NLLB-200 Distilled 600M — Hindi → Kangri (v2)

This model is a fine-tuned version of `facebook/nllb-200-distilled-600M` for Hindi → Kangri translation. It was trained on a curated parallel corpus of 49k Hindi–Kangri sentence pairs, with vocabulary and tokenizer extensions to support `kang_Deva`.
## Model Details

- Model Architecture: Transformer (encoder-decoder)
- Base: `facebook/nllb-200-distilled-600M`
- Languages:
  - Source: `hin_Deva` (Hindi)
  - Target: `kang_Deva` (Kangri in Devanagari script)
- Tokenizer: SentencePiece with extended vocabulary for `kang_Deva`
- Direction Supported: Hindi → Kangri only (unidirectional)
## How to Use

```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, pipeline

model_name = "cloghost/nllb-200-distilled-600M-hin-kang-v2"
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Use the first GPU if available, otherwise fall back to CPU.
device = 0 if torch.cuda.is_available() else -1

translator = pipeline(
    "translation",
    model=model,
    tokenizer=tokenizer,
    src_lang="hin_Deva",
    tgt_lang="kang_Deva",
    device=device,
)

# Sample Hindi input (a passage about the Himachali/Pahari language).
text = """मगर हिमाचली भाषा तो पहले से बोली जा रही है।
लोग सदियों से ही इसके संग जी रहे हैं।
पहाड़ी भाषा का इतिहास हिन्दी साहित्य के आदिकाल, जिसे सिद्ध चारण काल के नाम से भी जानते हैं
"""

translation = translator(text)
print(translation[0]["translation_text"])
```
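The usage example above feeds a multi-line paragraph to the pipeline as a single sequence. NLLB models have a limited input length, so for longer texts it can help to split the Hindi input into sentences first and translate each one separately. The helper below is a hypothetical sketch (not part of this model) that splits on the Devanagari danda (।) and newlines:

```python
import re

def split_hindi_sentences(text: str) -> list:
    """Split Hindi text on the danda (।) or newlines; drop empty pieces."""
    parts = re.split(r"[।\n]+", text)
    return [p.strip() for p in parts if p.strip()]

sentences = split_hindi_sentences(
    "मगर हिमाचली भाषा तो पहले से बोली जा रही है।\nलोग सदियों से ही इसके संग जी रहे हैं।"
)
print(len(sentences))  # 2
```

Each sentence can then be passed to `translator(...)` individually (or as a list, which the pipeline also accepts) and the outputs rejoined.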
## Benchmark Scores

Evaluated on a clean 5k-sample held-out test set:

| Metric | Value |
|---|---|
| BLEU | 26.03 |
| BLEU-4 | 14.11 |
| ROUGE-1 | 4.73% |
| ROUGE-L | 4.76% |
| METEOR | 43.63% |
| BERTScore-F1 | 93.39% |
| BERT Precision | 93.42% |
| BERT Recall | 93.37% |
| chrF | 53.93 |
| TER (lower is better) | 56.96 |
| Empty Predictions | 0 |
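For readers unfamiliar with chrF: scores like the 53.93 above are F-scores computed over character n-grams of the hypothesis and reference. The snippet below is a simplified illustration of the idea (it is not the evaluation script used for this model; real chrF uses n-grams up to 6 and word-order handling as in sacrebleu), using character unigrams through trigrams with the usual beta=2 weighting toward recall:

```python
from collections import Counter

def char_ngrams(text: str, n: int) -> Counter:
    """Count character n-grams, ignoring spaces as chrF typically does."""
    text = text.replace(" ", "")
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def simple_chrf(hypothesis: str, reference: str, max_n: int = 3, beta: float = 2.0) -> float:
    """Toy chrF: average n-gram precision/recall, combined as an F-beta score (0-100)."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp, ref = char_ngrams(hypothesis, n), char_ngrams(reference, n)
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / max(sum(hyp.values()), 1))
        recalls.append(overlap / max(sum(ref.values()), 1))
    p, r = sum(precisions) / max_n, sum(recalls) / max_n
    if p + r == 0:
        return 0.0
    return 100 * (1 + beta**2) * p * r / (beta**2 * p + r)

print(simple_chrf("abcd", "abcd"))  # identical strings score 100.0
```

Because it works on characters rather than words, chrF is more forgiving of morphological variation than BLEU, which is one reason it is often reported for low-resource Indic language pairs like this one.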
## Evaluation Results (self-reported)

- BLEU on Custom Hindi–Kangri Parallel Corpus: 26.030
- chrF on Custom Hindi–Kangri Parallel Corpus: 53.930
- BERTScore-F1 on Custom Hindi–Kangri Parallel Corpus: 0.934