---
license: mit
language:
- tr
metrics:
- rouge
- meteor
base_model:
- google/umt5-small
pipeline_tag: text2text-generation
---
# πŸ“ umt5-small Turkish Abstractive Summarization
## 🧠 Abstract
This model is a fine-tuned version of `umt5-small`, specifically adapted for **abstractive summarization** of Turkish-language texts. Leveraging the multilingual capabilities of the underlying umT5 architecture, it has been trained on a high-quality Turkish summarization dataset of diverse news articles paired with human-written summaries. The goal is to generate coherent, concise, and semantically accurate summaries from long-form Turkish content, making the model suitable for real-world applications such as news aggregation, document compression, and information retrieval.
Despite its compact size (~300M parameters, the smallest umT5 variant), the model demonstrates strong performance on standard evaluation metrics, including **ROUGE** and **METEOR**, achieving results within the commonly accepted thresholds for Turkish-language summarization. It strikes a practical balance between efficiency and quality, making it well suited to resource-constrained environments.
---
## πŸ” Metric Interpretation (Specific to Turkish)
- **ROUGE-1:** Measures unigram (word-level) overlap between the generated summary and the reference text. For Turkish summarization tasks, scores below **0.30** generally indicate weak lexical alignment, while scores above **0.40** indicate strong, fluent output.
- **ROUGE-2:** Evaluates bigram (two-word sequence) overlap. Since Turkish is an agglutinative language with rich morphology, high bigram overlap is harder to achieve; scores in the **0.15–0.30** range are considered average and acceptable for Turkish.
- **ROUGE-L:** Captures the longest common subsequence, reflecting sentence-level fluency and structural similarity. Acceptable ranges for Turkish are generally close to those for ROUGE-1, typically **0.28–0.40**.
- **METEOR:** Unlike ROUGE, METEOR also incorporates semantic similarity and synonymy, which makes it relatively robust for morphologically rich languages like Turkish. Scores in the **0.25–0.38** range are commonly observed and considered good in Turkish summarization settings (a minimal computation sketch follows below).
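As an illustration, here is a minimal sketch of how these scores could be computed with the Hugging Face `evaluate` library. The prediction and reference strings are placeholders, and disabling the stemmer is an assumption for Turkish (the default Porter stemmer targets English), not a setting documented by this model card.

```python
# pip install evaluate rouge_score nltk
import evaluate

# Placeholder model outputs and human-written references
predictions = ["Model tarafından üretilen özet."]
references = ["İnsan tarafından yazılmış referans özet."]

rouge = evaluate.load("rouge")
meteor = evaluate.load("meteor")

# use_stemmer=False because the default stemmer is English-specific
# (an assumption for Turkish text, not the card's documented setup)
rouge_scores = rouge.compute(
    predictions=predictions, references=references, use_stemmer=False
)
meteor_score = meteor.compute(predictions=predictions, references=references)

print(rouge_scores)  # keys: rouge1, rouge2, rougeL, rougeLsum
print(meteor_score)  # {'meteor': ...}
```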
---
## πŸ“Š Acceptable Metric Ranges
| Metric | Acceptable Range | Interpretation |
|----------|------------------|-----------------------------------|
| ROUGE-1 | 0.30 – 0.45 | Weak < 0.30, Good > 0.40 |
| ROUGE-2  | 0.15 – 0.30      | Typical for bigram-level overlap  |
| ROUGE-L | 0.28 – 0.40 | Similar to ROUGE-1 |
| METEOR | 0.25 – 0.38 | Balanced lexical & semantic match |
---
## πŸš€ Usage Example
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load the fine-tuned tokenizer and model (replace with the actual repo id)
tokenizer = AutoTokenizer.from_pretrained("your_username/umt5-small-turkish-summary")
model = AutoModelForSeq2SeqLM.from_pretrained("your_username/umt5-small-turkish-summary")

text = "Insert Turkish text to summarize."

# Tokenize the input, truncating long documents to fit the model
inputs = tokenizer(text, return_tensors="pt", max_length=1024, truncation=True)

# Generate the summary with beam search
summary_ids = model.generate(
    **inputs,
    max_length=100,
    num_beams=4,
    early_stopping=True,
)

summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
print(summary)
```
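
Equivalently, the high-level `pipeline` API wraps the same tokenize/generate/decode steps. This is a sketch assuming the same placeholder repo id as above; generation arguments are forwarded to `generate`.

```python
from transformers import pipeline

# The summarization pipeline bundles tokenizer, model, and decoding
summarizer = pipeline("summarization", model="your_username/umt5-small-turkish-summary")

result = summarizer(
    "Insert Turkish text to summarize.",
    max_length=100,
    num_beams=4,
    early_stopping=True,
)
print(result[0]["summary_text"])
```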