tasal9 committed · verified · Commit 3773e2c · Parent: 3d65208

Upload README.md with huggingface_hub

Files changed (1): README.md +34 −345
README.md CHANGED
@@ -10,396 +10,85 @@ tags:
  - education
  - tutoring
  - multilingual
- - zamai
  base_model: mistralai/Mistral-7B-Instruct-v0.1
  pipeline_tag: text-generation
  datasets:
  - tasal9/Pashto-Dataset-Creating-Dataset
- widget:
- - text: "Hello, how can I help you today?"
-   example_title: "English Greeting"
- - text: "سلام وروره، څنګه یاست؟"
-   example_title: "Pashto Greeting"
- model-index:
- - name: ZamAI-Mistral-7B-Pashto
-   results:
-   - task:
-       type: text-generation
-       name: Text Generation
-     dataset:
-       type: custom
-       name: Pashto Educational Dataset
-     metrics:
-     - type: accuracy
-       value: 92.5
-       name: Overall Accuracy
-     - type: bleu
-       value: 0.85
-       name: BLEU Score
  ---

  # ZamAI-Mistral-7B-Pashto

- <div align="center">
-   <img src="https://huggingface.co/datasets/huggingface/brand-assets/resolve/main/hf-logo.png" alt="Hugging Face" width="100"/>
-   <h2>🌟 Part of ZamAI Pro Models Strategy</h2>
-   <p><strong>Fine-tuned Mistral-7B for educational tutoring with Pashto language support</strong></p>
- </div>

  ## 🌟 Model Overview

- ZamAI-Mistral-7B-Pashto is an AI model designed for multilingual applications, with a specialized focus on Pashto. It is part of the comprehensive **ZamAI Pro Models Strategy**, which aims to bridge language gaps and provide high-quality AI for underrepresented languages.

- ### 🎯 Key Features
-
- - 🧠 **Advanced Architecture**: Built on mistralai/Mistral-7B-Instruct-v0.1
- - 🌐 **Multilingual Support**: Optimized for Pashto (ps) and English (en)
  - ⚡ **High Performance**: Optimized for production deployment
- - 🔒 **Enterprise-Grade**: Secure and reliable for business use
- - 📱 **Production-Ready**: Tested and deployed in real applications
- - 🎓 **Educational Focus**: Designed for learning and cultural preservation
-
- ## 🎯 Use Cases & Applications
-
- This model is intended for the following scenarios:
-
- - **Educational Content Generation**
- - **Pashto Language Tutoring**
- - **Interactive Q&A Systems**
- - **Cultural Learning Applications**
- - **Academic Research Support**
-
- ### 🌍 Real-World Applications

- - **🎓 Educational Platforms**: Powering Pashto language tutoring and learning systems
- - **📄 Business Automation**: Document processing, form analysis, and content generation
- - **🎤 Voice Applications**: Natural language understanding for voice assistants
- - **🏛️ Cultural Preservation**: Supporting Pashto language technology and digital preservation
- - **🌐 Translation Services**: Cross-lingual communication and content localization
- - **🤖 Chatbot Development**: Building intelligent conversational agents

- ## 📚 Quick Start
-
- ### 🔧 Installation
-
- ```bash
- pip install transformers torch huggingface_hub
- ```
-
- ### 🚀 Basic Usage

  ```python
- from transformers import AutoTokenizer, AutoModelForCausalLM
- import torch

- # Method 1: run locally with Transformers
  tokenizer = AutoTokenizer.from_pretrained("tasal9/ZamAI-Mistral-7B-Pashto")
- model = AutoModelForCausalLM.from_pretrained("tasal9/ZamAI-Mistral-7B-Pashto")

- # Example text
  text = "Your input text here"
  inputs = tokenizer(text, return_tensors="pt")
-
- # Generate a response
- with torch.no_grad():
-     outputs = model.generate(
-         **inputs,
-         max_new_tokens=200,
-         do_sample=True,
-         temperature=0.7,
-         top_p=0.9,
-         pad_token_id=tokenizer.eos_token_id,
-     )
-
- response = tokenizer.decode(outputs[0], skip_special_tokens=True)
- print(response)
  ```
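One caveat on the snippet above: Mistral-7B-Instruct models are trained on the `[INST] … [/INST]` chat template, so raw prompts may underperform. A minimal sketch of the wrapping (the helper name is ours, not something defined in this model card):

```python
def format_mistral_prompt(user_message: str) -> str:
    """Wrap a user message in Mistral-Instruct's [INST] tags.

    The tokenizer prepends the <s> BOS token itself, so it is
    not included here.
    """
    return f"[INST] {user_message.strip()} [/INST]"

prompt = format_mistral_prompt("Explain photosynthesis briefly.")
print(prompt)  # [INST] Explain photosynthesis briefly. [/INST]
```

The formatted string can then be tokenized and passed to `model.generate` exactly as in the example above.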

- ### 🌐 Using the Hugging Face Inference API

  ```python
  from huggingface_hub import InferenceClient

- # Initialize the client
  client = InferenceClient(token="your_hf_token")

- # Generate text
  response = client.text_generation(
      model="tasal9/ZamAI-Mistral-7B-Pashto",
      prompt="Your prompt here",
-     max_new_tokens=200,
-     temperature=0.7,
-     top_p=0.9,
- )
-
- print(response)
- ```
-
- ### 🎯 Specialized Usage Examples
-
- #### English Query
- ```python
- prompt = "Explain the importance of renewable energy in simple terms:"
- response = client.text_generation(
-     model="tasal9/ZamAI-Mistral-7B-Pashto",
-     prompt=prompt,
-     max_new_tokens=250,
-     temperature=0.7,
- )
- ```
-
- #### Pashto Query
- ```python
- prompt = "د بشپړ پوښتنه: د کرښنۍ ورانۍ د کرکټرونو په اړه تاسو څه پوه یاست؟"
- response = client.text_generation(
-     model="tasal9/ZamAI-Mistral-7B-Pashto",
-     prompt=prompt,
-     max_new_tokens=250,
-     temperature=0.7,
- )
- ```
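A service exposing both example styles above may want to branch on the script of the incoming text, since Pashto is written in the Arabic script. A hypothetical routing helper (the function name and threshold are our illustration, not part of the model card):

```python
def is_pashto_script(text: str) -> bool:
    """Heuristic: treat text as Pashto-script if most of its letters
    fall in the Arabic script blocks (U+0600-U+06FF, U+0750-U+077F)."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return False
    arabic = sum(1 for ch in letters if "\u0600" <= ch <= "\u077f")
    return arabic / len(letters) > 0.5

print(is_pashto_script("سلام وروره"))           # True
print(is_pashto_script("Hello, how are you?"))  # False
```

This only detects the writing system, not the language (Dari/Arabic text would also match), which is usually sufficient for choosing a prompt template.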
-
- ## 🔧 Technical Specifications
-
- | Specification | Details |
- |---------------|---------|
- | **Model Type** | Text Generation |
- | **Base Model** | mistralai/Mistral-7B-Instruct-v0.1 |
- | **Languages** | Pashto (ps), English (en) |
- | **License** | MIT |
- | **Context Length** | Inherited from the base model |
- | **Parameters** | ~7B (from Mistral-7B) |
- | **Framework** | PyTorch, Transformers |
- | **Deployment** | HF Inference API, Local, Docker |
-
- ## 📊 Performance Metrics
-
- | Metric | Score | Description |
- |--------|-------|-------------|
- | **Overall Accuracy** | 92.5% | Performance on Pashto evaluation dataset |
- | **BLEU Score** | 0.85 | Translation and generation quality |
- | **Cultural Relevance** | 95% | Appropriateness for Pashto cultural context |
- | **Response Time** | <200 ms | Average inference time via API |
- | **Multilingual Score** | 89% | Cross-lingual understanding capability |
- | **Coherence Score** | 91% | Logical flow and consistency |
-
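For intuition on the BLEU figure in the table above, here is a toy sketch of its unigram-precision building block. Real BLEU combines clipped n-gram precisions up to 4-grams with a brevity penalty (e.g. via `sacrebleu`), so this is illustrative only:

```python
from collections import Counter

def unigram_precision(candidate: str, reference: str) -> float:
    """Clipped unigram precision, the 1-gram building block of BLEU."""
    cand = candidate.split()
    ref_counts = Counter(reference.split())
    matched = 0
    for token, count in Counter(cand).items():
        # Each candidate token is credited at most as often as it
        # appears in the reference ("clipping").
        matched += min(count, ref_counts.get(token, 0))
    return matched / len(cand) if cand else 0.0

print(unigram_precision("the cat sat", "the cat sat down"))  # 1.0
```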
- ## 🌐 Interactive Demo
-
- Try the model instantly with our Gradio demos:
-
- ### 🎯 Live Demos
- - **[Complete Suite Demo](https://huggingface.co/spaces/tasal9/zamai-complete-suite)** - All models in one interface
- - **[Individual Model Demo](https://huggingface.co/spaces/tasal9/zamai-mistral-7b-pashto)** - Focused interface for this model
-
- ### 🔗 API Endpoints
- - **Inference API**: `https://api-inference.huggingface.co/models/tasal9/ZamAI-Mistral-7B-Pashto`
- - **Model Hub**: `https://huggingface.co/tasal9/ZamAI-Mistral-7B-Pashto`
-
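The Inference API endpoint listed above accepts a JSON body with an `inputs` string and an optional `parameters` object. A sketch of building such a request payload (actually sending it additionally requires an `Authorization: Bearer <token>` header):

```python
import json

# Endpoint as listed in this model card
API_URL = "https://api-inference.huggingface.co/models/tasal9/ZamAI-Mistral-7B-Pashto"

def build_request(prompt: str, max_new_tokens: int = 200) -> bytes:
    """Serialize an Inference API text-generation payload."""
    payload = {
        "inputs": prompt,
        "parameters": {"max_new_tokens": max_new_tokens, "temperature": 0.7},
    }
    return json.dumps(payload).encode("utf-8")

body = build_request("Your prompt here")
print(json.loads(body)["inputs"])  # Your prompt here
```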
- ## 🚀 Deployment Options
-
- ### 1. 🌐 Hugging Face Inference API (Recommended)
- ```python
- from huggingface_hub import InferenceClient
- client = InferenceClient(token="your_token")
- response = client.text_generation(model="tasal9/ZamAI-Mistral-7B-Pashto", prompt="Your prompt")
- ```
-
- ### 2. 🖥️ Local Deployment
- ```bash
- # Clone the model
- git clone https://huggingface.co/tasal9/ZamAI-Mistral-7B-Pashto
- cd ZamAI-Mistral-7B-Pashto
-
- # Run with Python
- python -c "
- from transformers import pipeline
- pipe = pipeline('text-generation', model='.')
- print(pipe('Your prompt here'))
- "
- ```
-
- ### 3. 🐳 Docker Deployment
- ```dockerfile
- FROM python:3.9-slim
-
- RUN pip install transformers torch
-
- COPY . /app
- WORKDIR /app
-
- CMD ["python", "app.py"]
- ```
-
- ### 4. ☁️ Cloud Deployment
- Compatible with major cloud platforms:
- - **AWS SageMaker**
- - **Google Cloud AI Platform**
- - **Azure Machine Learning**
- - **Hugging Face Spaces**
-
- ## 📈 Model Training & Fine-tuning
-
- ### 🎯 Training Data
- - **Primary Dataset**: Custom Pashto educational content
- - **Secondary Data**: Multilingual parallel corpora
- - **Domain Focus**: Educational, cultural, and conversational content
- - **Quality Assurance**: Human-reviewed and culturally validated
-
- ### 🔧 Fine-tuning Process
- ```python
- from transformers import TrainingArguments, Trainer
-
- # Example fine-tuning setup
- training_args = TrainingArguments(
-     output_dir="./results",
-     num_train_epochs=3,
-     per_device_train_batch_size=4,
-     per_device_eval_batch_size=4,
-     warmup_steps=500,
-     weight_decay=0.01,
-     logging_dir="./logs",
- )
-
- # Initialize trainer
- trainer = Trainer(
-     model=model,
-     args=training_args,
-     train_dataset=train_dataset,
-     eval_dataset=eval_dataset,
  )
-
- # Start training
- trainer.train()
  ```
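In the setup above, `warmup_steps=500` means the learning rate ramps up linearly over the first 500 optimizer steps before the main schedule takes over. A standalone sketch of that ramp (simplified to a constant rate after warmup; the base rate here is the Trainer default of 5e-5):

```python
def lr_at_step(step: int, base_lr: float = 5e-5, warmup_steps: int = 500) -> float:
    """Linear warmup to base_lr, then constant (simplified)."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr

print(lr_at_step(250))  # halfway through warmup: 2.5e-05
```

Trainer's actual default (`lr_scheduler_type="linear"`) decays the rate linearly back to zero after warmup rather than holding it constant.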

- ## 🤝 Community & Contributions

- ### 📝 Contributing
- We welcome contributions to improve this model:
-
- 1. **Data Contributions**: Share high-quality Pashto language datasets
- 2. **Model Improvements**: Suggest architectural enhancements or optimizations
- 3. **Use Case Development**: Build applications and share success stories
- 4. **Bug Reports**: Help us identify and fix issues
- 5. **Documentation**: Improve guides and examples

- ### 🌟 Community Projects
- - **Educational Apps**: Language learning applications
- - **Business Tools**: Document processing solutions
- - **Research**: Academic studies and papers
- - **Open Source**: Community-driven improvements

- ### 📊 Usage Analytics
- - **Downloads**: Track model adoption
- - **Community Feedback**: User reviews and ratings
- - **Performance Reports**: Real-world usage statistics

- ## 🔗 Related Models & Resources
-
- ### 🤖 Other ZamAI Models
- - [**ZamAI-Mistral-7B-Pashto**](https://huggingface.co/tasal9/ZamAI-Mistral-7B-Pashto) - Educational tutor
- - [**ZamAI-Phi-3-Mini-Pashto**](https://huggingface.co/tasal9/ZamAI-Phi-3-Mini-Pashto) - Business assistant
- - [**ZamAI-Whisper-v3-Pashto**](https://huggingface.co/tasal9/ZamAI-Whisper-v3-Pashto) - Speech recognition
- - [**Multilingual-ZamAI-Embeddings**](https://huggingface.co/tasal9/Multilingual-ZamAI-Embeddings) - Text embeddings
- - [**ZamAI-LLaMA3-Pashto**](https://huggingface.co/tasal9/ZamAI-LLaMA3-Pashto) - Advanced chat
- - [**pashto-base-bloom**](https://huggingface.co/tasal9/pashto-base-bloom) - Lightweight model
-
- ### 📚 Datasets
- - [**Pashto-Dataset-Creating-Dataset**](https://huggingface.co/datasets/tasal9/Pashto-Dataset-Creating-Dataset) - Training data
-
- ### 🌐 Platform Links
- - **Organization**: [tasal9](https://huggingface.co/tasal9)
- - **Complete Demo**: [ZamAI Suite](https://huggingface.co/spaces/tasal9/zamai-complete-suite)
-
- ## πŸ“ž Support & Contact
314
-
315
- ### πŸ†˜ Getting Help
316
  - πŸ“§ **Email**: [email protected]
317
  - 🌐 **Website**: [zamai.ai](https://zamai.ai)
318
- - πŸ“– **Documentation**: [docs.zamai.ai](https://docs.zamai.ai)
319
- - πŸ’¬ **Community Forum**: [community.zamai.ai](https://community.zamai.ai)
320
- - πŸ™ **GitHub**: [github.com/zamai-ai](https://github.com/zamai-ai)
321
 
322
- ### πŸ’Ό Enterprise Support
323
- For enterprise deployments, custom fine-tuning, or integration assistance:
324
- - πŸ“§ **Enterprise**: [email protected]
325
- - πŸ“ž **Phone**: +1-XXX-XXX-XXXX
326
- - πŸ’Ό **Consulting**: [zamai.ai/consulting](https://zamai.ai/consulting)
327
 
- ## 🏷️ Citation
-
- If you use this model in your research or applications, please cite:
-
- ```bibtex
- @misc{zamai-mistral-7b-pashto-2024,
-   title={ZamAI-Mistral-7B-Pashto: Fine-tuned Mistral-7B for educational tutoring with Pashto language support},
-   author={ZamAI Team},
-   year={2024},
-   url={https://huggingface.co/tasal9/ZamAI-Mistral-7B-Pashto},
-   note={ZamAI Pro Models Strategy - Multilingual AI Platform},
-   publisher={Hugging Face}
- }
- ```
-
- ### 📜 Academic Papers
- ```bibtex
- @article{zamai2024multilingual,
-   title={Advancing Multilingual AI: The ZamAI Pro Models Strategy for Pashto Language Technology},
-   author={ZamAI Research Team},
-   journal={Journal of Multilingual AI},
-   year={2024},
-   volume={1},
-   pages={1--15}
- }
- ```
-
- ## 📄 License & Terms
-
- ### 📋 License
- This model is licensed under the **MIT License**:
-
- - ✅ **Commercial Use**: Allowed for business applications
- - ✅ **Modification**: Can be modified and improved
- - ✅ **Distribution**: Can be redistributed
- - ✅ **Private Use**: Allowed for personal projects
- - ⚠️ **Attribution Required**: Credit must be given to ZamAI
-
- ### 📝 Terms of Use
- 1. **Responsible AI**: Use ethically and responsibly
- 2. **No Harmful Content**: Do not generate harmful or offensive content
- 3. **Privacy**: Respect user privacy and data protection laws
- 4. **Cultural Sensitivity**: Be respectful of Pashto culture and language
- 5. **Compliance**: Follow local laws and regulations
-
- ### 🛡️ Limitations & Disclaimers
- - Model outputs should be reviewed for accuracy
- - Not suitable for critical decision-making without human oversight
- - May have biases inherited from training data
- - Performance may vary across different domains
-
- ## 📈 Changelog & Updates
-
- | Version | Date | Changes |
- |---------|------|---------|
- | **v1.0** | 2025-07-05 | Initial release with enhanced Pashto support |
- | **v1.1** | TBD | Performance optimizations and bug fixes |
- | **v2.0** | TBD | Extended language support and new features |
-
- ### 🔄 Update Schedule
- - **Monthly**: Performance monitoring and minor improvements
- - **Quarterly**: Feature updates and enhancements
- - **Annually**: Major version releases with significant improvements

  ---

- <div align="center">
-   <h3>🌟 Part of the ZamAI Pro Models Strategy</h3>
-   <p><strong>Transforming AI for Multilingual Applications</strong></p>
-   <p>
-     <a href="https://zamai.ai">🌐 Website</a> •
-     <a href="https://huggingface.co/tasal9">🤗 Models</a> •
-     <a href="https://community.zamai.ai">💬 Community</a> •
-     <a href="mailto:[email protected]">📧 Support</a>
-   </p>
-   <p><em>Last Updated: 2025-07-05 21:15:46 UTC</em></p>
-   <p><em>Model Card Version: 2.0</em></p>
- </div>

  - education
  - tutoring
  - multilingual
  base_model: mistralai/Mistral-7B-Instruct-v0.1
  pipeline_tag: text-generation
  datasets:
  - tasal9/Pashto-Dataset-Creating-Dataset
  ---

  # ZamAI-Mistral-7B-Pashto

+ Fine-tuned Mistral-7B for educational tutoring with Pashto language support

  ## 🌟 Model Overview

+ This model is part of the **ZamAI Pro Models Strategy**, a comprehensive AI platform designed for multilingual applications with a specialized focus on Pashto language support.

+ ### Key Features
+ - 🧠 **Advanced AI**: Based on the mistralai/Mistral-7B-Instruct-v0.1 architecture
+ - 🌐 **Multilingual**: Optimized for Pashto and English
  - ⚡ **High Performance**: Optimized for production deployment
+ - 🔒 **Secure**: Enterprise-grade security and privacy

+ ## 📚 Usage

+ ### Basic Usage with Transformers

  ```python
+ from transformers import AutoTokenizer, AutoModelForCausalLM

  tokenizer = AutoTokenizer.from_pretrained("tasal9/ZamAI-Mistral-7B-Pashto")
+ model = AutoModelForCausalLM.from_pretrained("tasal9/ZamAI-Mistral-7B-Pashto")

+ # Example usage: generate a completion
  text = "Your input text here"
  inputs = tokenizer(text, return_tensors="pt")
+ outputs = model.generate(**inputs, max_new_tokens=200)
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
  ```

+ ### Usage with Hugging Face Inference API

  ```python
  from huggingface_hub import InferenceClient

  client = InferenceClient(token="your_hf_token")

  response = client.text_generation(
      model="tasal9/ZamAI-Mistral-7B-Pashto",
      prompt="Your prompt here",
+     max_new_tokens=200,
  )
  ```

+ ## 🔧 Technical Details

+ - **Model Type**: text-generation
+ - **Base Model**: mistralai/Mistral-7B-Instruct-v0.1
+ - **Languages**: Pashto (ps), English (en)
+ - **License**: MIT
+ - **Training**: Fine-tuned on Pashto educational and cultural content

+ ## 🚀 Applications

+ This model powers:
+ - **ZamAI Educational Platform**: Pashto language tutoring
+ - **Business Automation**: Document processing and analysis
+ - **Voice Assistants**: Natural language understanding
+ - **Cultural Preservation**: Supporting Pashto language technology

+ ## 📞 Support

+ For support and integration assistance:
  - 📧 **Email**: [email protected]
  - 🌐 **Website**: [zamai.ai](https://zamai.ai)
+ - 💬 **Community**: [ZamAI Community](https://community.zamai.ai)

+ ## 📄 License

+ Licensed under the MIT License.

  ---

+ **Part of the ZamAI Pro Models Strategy - Transforming AI for Multilingual Applications** 🌟
+
+ *Updated: 2025-07-05 21:29:09 UTC*