tags:
- qwen3
license: apache-2.0
language:
- ta
- en
---
# Qwen3-1.7B-Tamil-16bit-Instruct

## Model Description

This is a fine-tuned version of Qwen3-1.7B optimized for Tamil-language tasks. The model has been trained to understand and generate Tamil text across a range of domains, including coding, entertainment, question answering, reasoning, literature, ethics, and translation.

- **Developed by:** sabaridsnfuji
- **Model type:** Causal Language Model
- **Languages:** Tamil, English
- **License:** Apache 2.0
- **Base model:** Qwen3-1.7B
- **Parameter count:** 1.7B
- **Precision:** 16-bit

## Training Details

This Qwen3 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.

### Training Dataset

- **Dataset:** [abhinand/tamil-alpaca-orca](https://huggingface.co/datasets/abhinand/tamil-alpaca-orca)
- **Description:** A comprehensive Tamil instruction-following dataset based on the Alpaca and Orca methodologies
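Records in Alpaca-style datasets are typically rendered into a single prompt string before fine-tuning. A minimal sketch of one common template follows; the exact template used to train this model is an assumption, shown only to illustrate the dataset's instruction/input/response structure:

```python
def format_alpaca(instruction: str, input_text: str = "", response: str = "") -> str:
    """Render one instruction record into an Alpaca-style prompt (illustrative template)."""
    prompt = f"### Instruction:\n{instruction}\n\n"
    if input_text:  # the optional context field is omitted when empty
        prompt += f"### Input:\n{input_text}\n\n"
    prompt += f"### Response:\n{response}"
    return prompt

# Tamil example: "Write a sentence in Tamil." -> "This is an example sentence."
example = format_alpaca(
    "தமிழில் ஒரு வாக்கியம் எழுதுங்கள்.",
    response="இது ஒரு எடுத்துக்காட்டு வாக்கியம்.",
)
print(example)
```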
## Evaluation

### Evaluation Dataset

- **Dataset:** [abhinand/tamil-llama-eval](https://huggingface.co/datasets/abhinand/tamil-llama-eval)
- **Evaluation Date:** 2025-07-20
- **Total Samples:** 466

### Overall Performance Metrics

| Metric | Score | Standard Deviation |
|--------|-------|--------------------|
| **Overall Quality** | **0.704** | 0.032 |
| **Fluency** | **0.914** | 0.023 |
| **Relevance** | **0.565** | 0.078 |
| **Coherence** | **0.371** | 0.061 |
| **Completeness** | **0.750** | 0.039 |
| **Safety Score** | **0.984** | 0.009 |
| **Hallucination Risk** | **0.002** | 0.004 |
| **Perplexity** | 174.942 | 904.409 |

### Category-wise Performance

| Category | Samples | Overall Quality | Fluency | Relevance | Safety |
|----------|---------|-----------------|---------|-----------|--------|
| **Entertainment** | 50 | **0.749** | 0.911 | 0.711 | 0.974 |
| **Reasoning** | 50 | **0.740** | 0.920 | 0.574 | 0.968 |
| **Open QA** | 50 | **0.722** | 0.933 | 0.656 | 0.984 |
| **Literature** | 50 | **0.718** | 0.921 | 0.597 | 0.992 |
| **QA** | 50 | **0.711** | 0.909 | 0.556 | 0.980 |
| **Ethics** | 50 | **0.700** | 0.921 | 0.562 | 0.992 |
| **Generation** | 50 | **0.695** | 0.926 | 0.524 | 0.996 |
| **Unknown** | 16 | **0.690** | 0.894 | 0.529 | 1.000 |
| **Translation** | 50 | **0.664** | 0.937 | 0.462 | 0.976 |
| **Coding** | 50 | **0.642** | 0.855 | 0.451 | 0.988 |
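As a consistency check, the reported overall quality score matches (to three decimals) the sample-weighted mean of the per-category quality scores in the table above:

```python
# (samples, overall quality) per category, copied from the table above
categories = {
    "Entertainment": (50, 0.749),
    "Reasoning": (50, 0.740),
    "Open QA": (50, 0.722),
    "Literature": (50, 0.718),
    "QA": (50, 0.711),
    "Ethics": (50, 0.700),
    "Generation": (50, 0.695),
    "Unknown": (16, 0.690),
    "Translation": (50, 0.664),
    "Coding": (50, 0.642),
}

total_samples = sum(n for n, _ in categories.values())
weighted_quality = sum(n * s for n, s in categories.values()) / total_samples
print(total_samples, round(weighted_quality, 3))  # 466 0.704
```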
## Key Strengths

- ✅ **High Overall Quality:** Achieves a 0.704 overall quality score
- ✅ **Excellent Fluency:** Strong fluency score of 0.914 across all categories
- ✅ **Superior Safety:** Very high safety score of 0.984 with extremely low hallucination risk (0.002)
- ✅ **Best Category:** Strongest performance in entertainment content generation (0.749 quality score)

## Areas for Improvement

- 📊 **Coherence:** The moderate coherence score (0.371) leaves room for improvement in maintaining logical flow
- 📊 **Coding Tasks:** Performance is lowest in the coding category (0.642), an area for future enhancement

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "sabaridsnfuji/Qwen3-1.7B-tamil-16bit-Instruct"
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Example usage
prompt = "உங்கள் கேள்வி இங்கே:"  # "Your question here:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100, do_sample=True, temperature=0.7)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```

## Intended Use

This model is designed for:
- Tamil text generation and completion
- Question answering in Tamil
- Entertainment content creation
- Literature and creative writing
- General conversation in Tamil
- Translation tasks (with the limitations noted below)

## Limitations

- Coding performance is lower than in other categories
- Coherence scores indicate room for improvement in maintaining logical flow
- Translation tasks show lower relevance scores
- Performance may vary significantly across domains

## Ethical Considerations

The model shows a high safety score (0.984) and a very low measured hallucination risk (0.002) on the evaluation set. However, users should always review outputs for accuracy, especially in critical applications.

## Citation

If you use this model, please cite:

```bibtex
@misc{qwen3-tamil-instruct,
  title={Qwen3-1.7B-Tamil-16bit-Instruct},
  author={Sabari Nathan},
  year={2025},
  url={https://huggingface.co/sabaridsnfuji/Qwen3-1.7B-tamil-16bit-Instruct}
}
```