Update README.md
README.md
CHANGED
@@ -30,32 +30,25 @@ base_model:
|
|
30 |
|
31 |
## TL;DR & Quick results
|
32 |
|
33 |
-
Try it on [Space demo](https://huggingface.co/spaces/XSkills/nllb-turkmen-english) Article with full technical journey is available [Medium]().
|
34 |
-
|
35 |
-
### Test Results
|
36 |
-
|
37 |
-
| Direction | BLEU ↑ | chrF ↑ | TER ↓ | Test pairs |
|
38 |
-
|-----------|-------:|-------:|------:|-----------:|
|
39 |
-
| **tk → en** | **26.07** | 52.97 | 68.39 | 50 |
|
40 |
-
| **en → tk** | **8.13** | 39.39 | 87.30 | 50 |
|
41 |
|
42 |
### Model Comparison (Fine-tuned vs Original)
|
43 |
|
44 |
-
|
45 |
|
46 |
-
| Metric
|
47 |
-
|
48 |
-
| BLEU
|
49 |
-
| chrF
|
50 |
-
| TER
|
51 |
|
52 |
-
####
|
53 |
|
54 |
-
| Metric
|
55 |
-
|
56 |
-
| BLEU
|
57 |
-
| chrF
|
58 |
-
| TER
|
59 |
|
60 |
*Scores computed with sacreBLEU 2.5, chrF, TER on the official `test` split.
|
61 |
A separate spreadsheet with **human adequacy/fluency ratings** is available in the article.*
|
@@ -180,6 +173,35 @@ A manual review on 50 random test sentences showed:
|
|
180 |
- Gender/politeness nuances are not guaranteed.
|
181 |
- CC-BY-NC licence forbids commercial use; respect Meta’s original terms.
|
182 |
|
183 |
## Citation
|
184 |
```bibtex
|
185 |
@misc{durdyyev2025turkmenNLLBLoRA,
|
|
|
30 |
|
31 |
## TL;DR & Quick results
|
32 |
|
33 |
+
Try it on the [Space demo](https://huggingface.co/spaces/XSkills/nllb-turkmen-english). An article covering the full technical journey is available on [Medium](https://medium.com/@meinnps/fine-tuning-nllb-200-with-lora-on-a-650-sentence-turkmen-english-corpus-082f68bdec71).
|
34 |
|
35 |
### Model Comparison (Fine-tuned vs Original)
|
36 |
|
37 |
+
#### English to Turkmen
|
38 |
|
39 |
+
| Metric | Fine-tuned | Original | Difference |
|
40 |
+
|---------------------------|-----------:|---------:|-----------:|
|
41 |
+
| **BLEU** | 8.24 | 8.12 | +0.12 |
|
42 |
+
| **chrF** | 39.55 | 39.46 | +0.09 |
|
43 |
+
| **TER (lower is better)** | 87.20 | 87.30 | -0.10 |
|
44 |
|
45 |
+
#### Turkmen to English
|
46 |
|
47 |
+
| Metric | Fine-tuned | Original | Difference |
|
48 |
+
|---------------------------|-----------:|---------:|-----------:|
|
49 |
+
| **BLEU** | 25.88 | 26.48 | -0.60 |
|
50 |
+
| **chrF** | 52.71 | 52.91 | -0.20 |
|
51 |
+
| **TER (lower is better)** | 67.70 | 69.70 | -2.00 |
|
52 |
|
53 |
*Scores computed with sacreBLEU 2.5, chrF, TER on the official `test` split.
|
54 |
A separate spreadsheet with **human adequacy/fluency ratings** is available in the article.*
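
The Difference column in the two tables above is fine-tuned minus original, so its sign reads differently for TER than for BLEU and chrF. The sketch below reproduces that arithmetic from the table values. The names `SCORES`, `difference`, `improved`, and `summarize` are illustrative assumptions and not part of the model repository.

```python
# Scores copied from the comparison tables: (fine-tuned, original).
SCORES = {
    "en->tk": {"BLEU": (8.24, 8.12), "chrF": (39.55, 39.46), "TER": (87.20, 87.30)},
    "tk->en": {"BLEU": (25.88, 26.48), "chrF": (52.71, 52.91), "TER": (67.70, 69.70)},
}

# TER is an error rate, so a lower score is the better one.
LOWER_IS_BETTER = {"TER"}


def difference(fine_tuned, original):
    """Fine-tuned minus original, rounded to two decimals like the tables."""
    return round(fine_tuned - original, 2)


def improved(metric, diff):
    """True when the sign of the difference means the fine-tuned model did better."""
    return diff < 0 if metric in LOWER_IS_BETTER else diff > 0


def summarize(scores=SCORES):
    """Return (direction, metric, difference, improved) for every table row."""
    rows = []
    for direction, metrics in scores.items():
        for metric, (fine_tuned, original) in metrics.items():
            diff = difference(fine_tuned, original)
            rows.append((direction, metric, diff, improved(metric, diff)))
    return rows


if __name__ == "__main__":
    for direction, metric, diff, better in summarize():
        verdict = "better" if better else "worse"
        print(f"{direction} {metric}: {diff:+.2f} ({verdict})")
```

Read this way, the fine-tuned model is ahead on all three metrics for English to Turkmen, and only on TER for Turkmen to English.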
|
|
|
173 |
- Gender/politeness nuances are not guaranteed.
|
174 |
- CC-BY-NC licence forbids commercial use; respect Meta’s original terms.
|
175 |
|
176 |
+
## How to Contribute
|
177 |
+
|
178 |
+
We welcome contributions to improve Turkmen-English translation capabilities! Here's how you can help:
|
179 |
+
|
180 |
+
### Data Contributions
|
181 |
+
- **Dataset contributions**: Instructions for contributing to the dataset are in the [dataset README](https://huggingface.co/datasets/XSkills/turkmen_english_s500/blob/main/README.md).
|
182 |
+
|
183 |
+
### Code Contributions
|
184 |
+
- **Hyperparameter experiments**: Try different LoRA configurations and document your results
|
185 |
+
- **Evaluation**: Help with human evaluation of translation quality and fluency
|
186 |
+
- **Bug fixes**: Report issues or submit fixes for the model implementation
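
For hyperparameter experiments, one way to keep runs comparable is to enumerate the configurations up front and log each result in the same table shape. This is a minimal sketch under stated assumptions: the keys `r`, `lora_alpha`, and `lora_dropout` mirror parameter names of PEFT's `LoraConfig`, but the values in `SEARCH_SPACE` are hypothetical and are not the settings used for this model. The helpers `lora_grid` and `results_row` are illustrative names.

```python
from itertools import product

# Hypothetical search space; illustrative values, not this model's settings.
SEARCH_SPACE = {
    "r": [8, 16, 32],
    "lora_alpha": [16, 32],
    "lora_dropout": [0.05, 0.1],
}


def lora_grid(space):
    """Expand a dict of value lists into one dict per combination."""
    keys = list(space)
    return [dict(zip(keys, values)) for values in product(*(space[k] for k in keys))]


def results_row(config, bleu=None, chrf=None, ter=None):
    """Markdown table row for one run; scores stay blank until measured."""
    cells = [config["r"], config["lora_alpha"], config["lora_dropout"], bleu, chrf, ter]
    return "| " + " | ".join("" if c is None else str(c) for c in cells) + " |"


if __name__ == "__main__":
    print("| r | lora_alpha | lora_dropout | BLEU | chrF | TER |")
    print("|--:|-----------:|-------------:|-----:|-----:|----:|")
    for config in lora_grid(SEARCH_SPACE):
        print(results_row(config))
```

Each dictionary can be passed to whatever training script you use; filling in the score columns after evaluation gives a results table that is easy to attach to a pull request.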
|
187 |
+
|
188 |
+
### Use Cases & Documentation
|
189 |
+
- **Example applications**: Share how you're using the model for research or projects
|
190 |
+
- **Domain-specific guides**: Create guides for using the model in specific domains
|
191 |
+
- **Translation examples**: Share interesting or challenging translation examples
|
192 |
+
|
193 |
+
### Getting Started
|
194 |
+
1. Fork the repository
|
195 |
+
2. Make your changes
|
196 |
+
3. Submit a pull request with clear documentation of your contribution
|
197 |
+
4. For data contributions, contact the maintainer directly
|
198 |
+
|
199 |
+
All contributors will be acknowledged in the model documentation. Contact [[email protected]](mailto:[email protected]) with any questions or to discuss potential contributions.
|
200 |
+
|
201 |
+
---
|
202 |
+
|
203 |
+
*Note: This model is licensed under CC-BY-NC-4.0, so all contributions must be compatible with non-commercial use.*
|
204 |
+
|
205 |
## Citation
|
206 |
```bibtex
|
207 |
@misc{durdyyev2025turkmenNLLBLoRA,
|