---
license: mit
language:
  - tr
metrics:
  - rouge
  - meteor
base_model:
  - google/umt5-small
pipeline_tag: text2text-generation
---

πŸ“ umt5-small Turkish Abstractive Summarization

## 🧠 Abstract

This model is a fine-tuned version of umt5-small, adapted for abstractive summarization of Turkish-language text. Leveraging the multilingual capabilities of the underlying umT5 architecture, it was trained on a high-quality Turkish summarization dataset containing diverse news articles paired with human-written summaries. The goal is to generate coherent, concise, and semantically accurate summaries from long-form Turkish content, making the model suitable for real-world applications such as news aggregation, document compression, and information retrieval.

Despite its compact size, the model demonstrates strong performance on standard evaluation metrics including ROUGE and METEOR, achieving results within the commonly accepted ranges for Turkish-language summarization. It strikes a practical balance between efficiency and quality, making it well suited to resource-constrained environments.


πŸ” Metric Interpretation (Specific to Turkish)

- **ROUGE-1**: Measures unigram (word-level) overlap between the generated summary and the reference text. For Turkish summarization tasks, scores below 0.30 generally indicate weak lexical alignment, while scores above 0.40 indicate strong, fluent output.

- **ROUGE-2**: Evaluates bigram (two-word sequence) overlap. Because Turkish is an agglutinative language with rich morphology, high bigram overlap is harder to achieve; a range of 0.15–0.30 is considered average and acceptable for Turkish.

- **ROUGE-L**: Captures the longest common subsequence, reflecting sentence-level fluency and structural similarity. Acceptable ranges for Turkish are generally close to ROUGE-1, typically 0.28–0.40.

- **METEOR**: Unlike ROUGE, METEOR also incorporates semantic similarity and synonymy, and it performs relatively well on morphologically rich languages like Turkish. Scores of 0.25–0.38 are commonly observed and considered good in Turkish summarization settings.
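To make the overlap intuition above concrete, here is a minimal, illustrative sketch of n-gram F1 over whitespace tokens. Note that this is *not* the official ROUGE implementation (which adds stemming, multi-reference handling, and other refinements); the `rouge_n_f1` helper and the example sentences are purely hypothetical.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return a multiset of n-grams from a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n_f1(candidate, reference, n=1):
    """F1 over n-gram overlap between candidate and reference (whitespace tokens)."""
    cand, ref = ngrams(candidate.split(), n), ngrams(reference.split(), n)
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

candidate = "ekonomi bakanı yeni teşvik paketini açıkladı"
reference = "bakan yeni teşvik paketini bugün açıkladı"
print(round(rouge_n_f1(candidate, reference, n=1), 2))  # → 0.67
print(round(rouge_n_f1(candidate, reference, n=2), 2))  # → 0.4
```

The drop from 0.67 (unigrams) to 0.4 (bigrams) illustrates why acceptable ROUGE-2 ranges sit well below ROUGE-1 for a morphologically rich language like Turkish.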


## 📊 Acceptable Metric Ranges

| Metric  | Acceptable Range | Interpretation                    |
|---------|------------------|-----------------------------------|
| ROUGE-1 | 0.30 – 0.45      | Weak below 0.30, good above 0.40  |
| ROUGE-2 | 0.15 – 0.30      | Typical for bigram-level overlap  |
| ROUGE-L | 0.28 – 0.40      | Similar to ROUGE-1                |
| METEOR  | 0.25 – 0.38      | Balanced lexical & semantic match |

## 🚀 Usage Example

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("your_username/umt5-small-turkish-summary")
model = AutoModelForSeq2SeqLM.from_pretrained("your_username/umt5-small-turkish-summary")

text = "Insert Turkish text to summarize."
# Truncate long inputs to fit the model's context window
inputs = tokenizer(text, return_tensors="pt", max_length=1024, truncation=True)

# Beam search generally yields more fluent summaries than greedy decoding
summary_ids = model.generate(
    **inputs,
    max_length=100,
    num_beams=4,
    early_stopping=True,
)

summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
print(summary)
```
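Inputs longer than the 1024-token limit above are silently truncated, so for long documents a common workaround is to summarize overlapping chunks and then summarize the concatenation of the partial summaries. A minimal word-level chunking sketch is shown below; the `chunk_words` helper is hypothetical, and production code would split by tokenizer tokens rather than whitespace words.

```python
def chunk_words(words, size=400, overlap=50):
    """Split a word list into overlapping windows so each fits the model's input limit."""
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + size])
        if start + size >= len(words):
            break
    return chunks

# Example: a 1000-word document split into 400-word windows with 50-word overlap
words = [f"w{i}" for i in range(1000)]
chunks = chunk_words(words, size=400, overlap=50)
print(len(chunks))  # → 3
```

Each chunk can then be passed through the `tokenizer`/`model.generate` pipeline above, and the resulting partial summaries joined and summarized once more.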