# Qwen3-1.7B-Tamil-16bit-Instruct

## Model Description
This is a fine-tuned version of Qwen3-1.7B specifically optimized for Tamil language tasks. The model has been trained to understand and generate Tamil text across various domains including coding, entertainment, question-answering, reasoning, literature, ethics, and translation.
- Developed by: sabaridsnfuji
- Model type: Causal Language Model
- Language: Tamil
- License: Apache 2.0
- Base model: Qwen3-1.7B
- Parameter count: 1.7B
- Precision: 16-bit
## Training Details

This Qwen3 model was trained 2x faster with Unsloth and Hugging Face's TRL library.

### Training Dataset
- Dataset: abhinand/tamil-alpaca-orca
- Description: A comprehensive Tamil instruction-following dataset based on Alpaca and Orca methodologies
## Evaluation

### Evaluation Dataset
- Dataset: abhinand/tamil-llama-eval
- Evaluation Date: 2025-07-20
- Total Samples: 466
### Overall Performance Metrics

| Metric | Score | Standard Deviation |
|---|---|---|
| Overall Quality | 0.704 | 0.032 |
| Fluency | 0.914 | 0.023 |
| Relevance | 0.565 | 0.078 |
| Coherence | 0.371 | 0.061 |
| Completeness | 0.750 | 0.039 |
| Safety Score | 0.984 | 0.009 |
| Hallucination Risk | 0.002 | 0.004 |
| Perplexity | 174.942 | 904.409 |
### Category-wise Performance

| Category | Samples | Overall Quality | Fluency | Relevance | Safety |
|---|---|---|---|---|---|
| Entertainment | 50 | 0.749 | 0.911 | 0.711 | 0.974 |
| Reasoning | 50 | 0.740 | 0.920 | 0.574 | 0.968 |
| Open QA | 50 | 0.722 | 0.933 | 0.656 | 0.984 |
| Literature | 50 | 0.718 | 0.921 | 0.597 | 0.992 |
| QA | 50 | 0.711 | 0.909 | 0.556 | 0.980 |
| Ethics | 50 | 0.700 | 0.921 | 0.562 | 0.992 |
| Generation | 50 | 0.695 | 0.926 | 0.524 | 0.996 |
| Unknown | 16 | 0.690 | 0.894 | 0.529 | 1.000 |
| Translation | 50 | 0.664 | 0.937 | 0.462 | 0.976 |
| Coding | 50 | 0.642 | 0.855 | 0.451 | 0.988 |
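
As a quick sanity check (assuming the reported overall quality is the sample-weighted mean of the per-category scores, which the numbers bear out), the category table is consistent with the 0.704 overall figure:

```python
# Per-category (sample count, overall quality) pairs from the table above.
categories = {
    "Entertainment": (50, 0.749),
    "Reasoning": (50, 0.740),
    "Open QA": (50, 0.722),
    "Literature": (50, 0.718),
    "QA": (50, 0.711),
    "Ethics": (50, 0.700),
    "Generation": (50, 0.695),
    "Unknown": (16, 0.690),
    "Translation": (50, 0.664),
    "Coding": (50, 0.642),
}

total = sum(n for n, _ in categories.values())  # 466 evaluation samples
weighted = sum(n * q for n, q in categories.values()) / total
print(f"{weighted:.3f}")  # -> 0.704, matching the reported overall quality
```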
### Key Strengths

✅ High Overall Quality: 0.704 overall quality score across 466 evaluation samples
✅ Excellent Fluency: strong fluency score of 0.914, consistent across all categories
✅ Superior Safety: very high safety score of 0.984
✅ Low Hallucination Risk: extremely low hallucination risk of 0.002
✅ Best Category: strongest in entertainment content generation (0.749 quality score)
### Areas for Improvement

📊 Coherence: the low coherence score (0.371) is the weakest overall metric and the clearest target for improvement
📊 Coding Tasks: the coding category lags the rest (0.642 quality, 0.451 relevance) and is an area for future enhancement
## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("sabaridsnfuji/Qwen3-1.7B-tamil-16bit-Instruct")
tokenizer = AutoTokenizer.from_pretrained("sabaridsnfuji/Qwen3-1.7B-tamil-16bit-Instruct")

# Example usage
prompt = "உங்கள் கேள்வி இங்கே:"  # "Your question here:"
inputs = tokenizer(prompt, return_tensors="pt")
# max_new_tokens bounds the generated continuation; do_sample=True is
# required for temperature to take effect.
outputs = model.generate(**inputs, max_new_tokens=100, do_sample=True, temperature=0.7)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
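
Since this is an instruction-tuned model, you will generally get better results by wrapping the prompt in the model's chat template (via `tokenizer.apply_chat_template`) rather than feeding raw text. As a rough illustration of the ChatML-style structure that template produces for the Qwen family (`format_chatml` below is a hand-rolled sketch, not a library function):

```python
def format_chatml(messages):
    """Render messages in the ChatML-style layout used by the Qwen family.

    For real use, prefer tokenizer.apply_chat_template(messages,
    tokenize=False, add_generation_prompt=True), which also handles
    model-specific details such as system prompts.
    """
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>" for m in messages]
    parts.append("<|im_start|>assistant\n")  # cue the model to answer
    return "\n".join(parts)

# "What is the capital of Tamil Nadu?"
prompt = format_chatml([{"role": "user", "content": "தமிழ்நாட்டின் தலைநகரம் எது?"}])
print(prompt)
```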
## Intended Use
This model is designed for:
- Tamil text generation and completion
- Question-answering in Tamil
- Entertainment content creation
- Literature and creative writing
- General conversation in Tamil
- Translation tasks (with noted limitations)
## Limitations
- Coding performance is below optimal levels
- Coherence scores indicate room for improvement in maintaining logical flow
- Translation tasks show lower relevance scores
- Performance may vary significantly across different domains
## Ethical Considerations
The model maintains high safety standards (0.984) and extremely low hallucination risk (0.002), making it suitable for responsible AI applications. However, users should always review outputs for accuracy, especially for critical applications.
## Citation

If you use this model, please cite:

```bibtex
@misc{qwen3-tamil-instruct,
  title={Qwen3-1.7B-Tamil-16bit-Instruct},
  author={Sabari Nathan},
  year={2025},
  url={https://huggingface.co/sabaridsnfuji/Qwen3-1.7B-tamil-16bit-Instruct}
}
```