---
license: mit
language:
- tr
metrics:
- rouge
- meteor
base_model:
- google/umt5-small
pipeline_tag: text2text-generation
---
|
|
|
# umt5-small Turkish Abstractive Summarization
|
|
|
## Abstract
|
|
|
This model is a fine-tuned version of `umt5-small`, adapted for **abstractive summarization** of Turkish-language text. Building on the multilingual capabilities of the underlying umT5 architecture, it was trained on a high-quality Turkish summarization dataset of diverse news articles paired with human-written summaries. The goal is to generate coherent, concise, and semantically accurate summaries of long-form Turkish content, making the model suitable for real-world applications such as news aggregation, document compression, and information retrieval.
|
|
|
Despite its small size (~300M parameters, a large share of which sit in the 256k-token vocabulary embeddings), the model performs well on standard evaluation metrics including **ROUGE** and **METEOR**, achieving results within the commonly accepted thresholds for Turkish-language summarization. It strikes a practical balance between efficiency and quality, making it well suited to resource-constrained environments.
|
|
|
---
|
|
|
## Metric Interpretation (Specific to Turkish)
|
|
|
- **ROUGE-1:** Measures unigram (word-level) overlap between the generated summary and the reference text. For Turkish summarization tasks, scores below **0.30** generally indicate weak lexical alignment, while scores above **0.40** indicate strong, fluent output.

- **ROUGE-2:** Evaluates bigram (two-word sequence) overlap. Because Turkish is an agglutinative language with rich morphology, high bigram overlap is harder to achieve, so scores in the **0.15–0.30** range are considered average and acceptable for Turkish.

- **ROUGE-L:** Captures the longest common subsequence, reflecting sentence-level fluency and structural similarity. Acceptable ranges for Turkish generally track ROUGE-1, typically **0.28–0.40**.

- **METEOR:** Unlike ROUGE, METEOR also incorporates stemming and synonym matching, so it copes relatively well with morphologically rich languages like Turkish. Scores in the **0.25–0.38** range are commonly observed and considered good in Turkish summarization settings.
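To make the ROUGE definitions above concrete, here is a minimal, dependency-free sketch of n-gram overlap F1 — the core of ROUGE-1 and ROUGE-2, without stemming or other refinements. In practice you would use a library such as `evaluate` or `rouge-score`; the Turkish sentences below are illustrative only, not drawn from the evaluation set.

```python
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list, with multiplicities."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n_f1(candidate, reference, n=1):
    """F1 over n-gram overlap -- the core of ROUGE-N (no stemming)."""
    cand = ngrams(candidate.split(), n)
    ref = ngrams(reference.split(), n)
    overlap = sum((cand & ref).values())
    if not cand or not ref or not overlap:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

# Toy example (illustrative only).
ref = "hükümet yeni ekonomi paketini açıkladı"
cand = "hükümet ekonomi paketini bugün açıkladı"
print(round(rouge_n_f1(cand, ref, 1), 2))  # 0.8  (4 of 5 unigrams match)
print(round(rouge_n_f1(cand, ref, 2), 2))  # 0.25 (1 of 4 bigrams match)
```

Note how the bigram score drops sharply even though the two sentences are nearly synonymous — exactly the effect, amplified by Turkish suffixation, that motivates the lower acceptable range for ROUGE-2 above.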
|
|
|
---
|
|
|
## Acceptable Metric Ranges
|
|
|
| Metric   | Acceptable Range | Interpretation                    |
|----------|------------------|-----------------------------------|
| ROUGE-1  | 0.30 – 0.45      | Weak < 0.30, good > 0.40          |
| ROUGE-2  | 0.15 – 0.30      | Typical for bigram-level overlap  |
| ROUGE-L  | 0.28 – 0.40      | Similar to ROUGE-1                |
| METEOR   | 0.25 – 0.38      | Balanced lexical & semantic match |
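The bands in this table can be encoded as a small helper for sanity-checking evaluation runs. The function below is a hypothetical convenience, not part of the model; it simply hard-codes the ROUGE-1 cut-offs (0.30 and 0.40) from the table above.

```python
def interpret_rouge1(score: float) -> str:
    """Map a ROUGE-1 score to the quality bands in the table above."""
    if score < 0.30:
        return "weak"
    if score > 0.40:
        return "good"
    return "acceptable"

print(interpret_rouge1(0.25))  # weak
print(interpret_rouge1(0.35))  # acceptable
print(interpret_rouge1(0.42))  # good
```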
|
|
|
---
|
|
|
## Usage Example
|
|
|
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Replace with the actual model repository ID.
tokenizer = AutoTokenizer.from_pretrained("your_username/umt5-small-turkish-summary")
model = AutoModelForSeq2SeqLM.from_pretrained("your_username/umt5-small-turkish-summary")

text = "Insert Turkish text to summarize."

# Truncate long inputs to the maximum supported length.
inputs = tokenizer(text, return_tensors="pt", max_length=1024, truncation=True)

# Beam search tends to produce more fluent summaries than greedy decoding.
summary_ids = model.generate(
    **inputs,
    max_length=100,
    num_beams=4,
    early_stopping=True,
)

summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
print(summary)
```
|
|
|