---
license: mit
language:
- tr
metrics:
- rouge
- meteor
base_model:
- google/umt5-small
pipeline_tag: text2text-generation
---

# 📝 umt5-small Turkish Abstractive Summarization

## 🧠 Abstract

This model is a fine-tuned version of `umt5-small`, adapted for **abstractive summarization** of Turkish-language texts. Leveraging the multilingual capabilities of the underlying umT5 architecture, it was trained on a high-quality Turkish summarization dataset of diverse news articles paired with human-written summaries. Its goal is to generate coherent, concise, and semantically accurate summaries from long-form Turkish content, making it suitable for real-world applications such as news aggregation, document compression, and information retrieval.

Despite its small size (~60M parameters), the model performs strongly on standard evaluation metrics, including **ROUGE** and **METEOR**, achieving results within the commonly accepted ranges for Turkish-language summarization. It strikes a practical balance between efficiency and quality, making it well suited to resource-constrained environments.

---

## 🔍 Metric Interpretation (Specific to Turkish)

- **ROUGE-1:** Measures unigram (word-level) overlap between the generated summary and the reference text. For Turkish summarization, scores below **0.30** generally indicate weak lexical alignment, while scores above **0.40** indicate strong, fluent outputs.
- **ROUGE-2:** Evaluates bigram (two-word sequence) overlap. Because Turkish is an agglutinative language with rich morphology, high bigram overlap is harder to achieve; a range of **0.15–0.30** is considered average and acceptable for Turkish.
- **ROUGE-L:** Captures the longest common subsequence, reflecting sentence-level fluency and structural similarity. Acceptable ranges for Turkish are generally close to ROUGE-1, typically **0.28–0.40**.
- **METEOR:** Unlike ROUGE, METEOR also accounts for stemming and synonymy, so it handles morphologically rich languages like Turkish relatively well. Scores in the range of **0.25–0.38** are commonly observed and considered good in Turkish summarization settings.

---

## 📊 Acceptable Metric Ranges

| Metric  | Acceptable Range | Interpretation                    |
|---------|------------------|-----------------------------------|
| ROUGE-1 | 0.30 – 0.45      | Weak < 0.30, good > 0.40          |
| ROUGE-2 | 0.15 – 0.30      | Typical for bigram-level overlap  |
| ROUGE-L | 0.28 – 0.40      | Similar to ROUGE-1                |
| METEOR  | 0.25 – 0.38      | Balanced lexical & semantic match |

---

## 🚀 Usage Example

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("your_username/umt5-small-turkish-summary")
model = AutoModelForSeq2SeqLM.from_pretrained("your_username/umt5-small-turkish-summary")

text = "Insert Turkish text to summarize."
inputs = tokenizer(text, return_tensors="pt", max_length=1024, truncation=True)

summary_ids = model.generate(
    **inputs,
    max_length=100,
    num_beams=4,
    early_stopping=True,
)

summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
print(summary)
```
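For intuition about the ROUGE-1 ranges discussed above, unigram-overlap F1 can be sketched in a few lines of Python. This is a simplified, whitespace-tokenized illustration only, not the official ROUGE implementation (which applies stemming and more careful tokenization), and the Turkish sentences below are invented examples.

```python
from collections import Counter

def rouge1_f(candidate: str, reference: str) -> float:
    """Simplified ROUGE-1 F1: unigram overlap between candidate and reference."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Count each shared token up to its minimum frequency in the two texts.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

# Inflected forms do not match at the surface level ("uyuyor" vs. "uyudu"),
# which is why bigram-level scores are harder to reach in Turkish.
print(rouge1_f("kedi evde uyuyor", "kedi evde uyudu"))  # → ~0.67
```

Because agglutinative suffixes make exact-match overlap pessimistic, official evaluations often pair ROUGE with METEOR, which credits stem and synonym matches.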