πŸ“ T5-small fine-tuned on XSUM for Summarization

This model is a T5-small checkpoint fine-tuned on the Extreme Summarization (XSUM) dataset for abstractive summarization.
It generates a one-sentence summary of a given news article.

πŸ“¦ Model Details

  • Base model: T5-small
  • Task: Abstractive summarization
  • Language: English
  • Dataset: XSUM
  • Fine-tuning: 1 epoch (12,753 steps)
  • Max input length: 1024 tokens
  • Max target length: 128 tokens (see the preprocessing sketch below)
  • Parameters: ~60.5M (float32)
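
The fine-tuning code is not published with this card, but the hyperparameters above imply a standard seq2seq preprocessing step. A minimal sketch, assuming the usual datasets/transformers workflow and XSUM's "document"/"summary" column names:

from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-small")

def preprocess(batch):
    # Prefix each article with T5's summarization task prefix and
    # truncate inputs to the 1024-token limit listed above
    model_inputs = tokenizer(
        ["summarize: " + doc for doc in batch["document"]],
        max_length=1024,
        truncation=True,
    )
    # Tokenize reference summaries as labels, truncated to 128 tokens
    labels = tokenizer(text_target=batch["summary"], max_length=128, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

xsum = load_dataset("EdinburghNLP/xsum")
tokenized = xsum.map(preprocess, batched=True)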

πŸš€ How to Use

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
model = AutoModelForSeq2SeqLM.from_pretrained("ShahzebKhoso/T5-small-xsum")
tokenizer = AutoTokenizer.from_pretrained("ShahzebKhoso/T5-small-xsum")

# T5 expects a task prefix; truncate long articles to the 1024-token input limit
text = "The Prime Minister held a meeting today with the cabinet..."
inputs = tokenizer("summarize: " + text, return_tensors="pt", truncation=True, max_length=1024)

# Beam search with a length penalty favours a fluent one-sentence summary
summary_ids = model.generate(**inputs, max_length=64, min_length=10, length_penalty=2.0, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
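
For quick experiments, the same checkpoint also works through the high-level pipeline API. A minimal sketch; whether this checkpoint's config retains T5's default "summarize: " task prefix is an assumption, so prepend the prefix manually if summaries degrade:

from transformers import pipeline

summarizer = pipeline("summarization", model="ShahzebKhoso/T5-small-xsum")

article = "The Prime Minister held a meeting today with the cabinet..."
# Generation arguments mirror the generate() call above
print(summarizer(article, max_length=64, min_length=10, num_beams=4)[0]["summary_text"])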

πŸ“Š Evaluation Results

Evaluated on the XSUM validation set:

Metric                  Score
ROUGE-1                 28.591
ROUGE-2                  7.8217
ROUGE-L                 22.38
ROUGE-Lsum              22.382
Gen. length (tokens)    19.71
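
These numbers can be sanity-checked with the evaluate library. A minimal sketch (not the author's evaluation script), assuming datasets and evaluate are installed; recent datasets versions may additionally need trust_remote_code=True to load XSUM, and ROUGE scores come back as fractions, so they are scaled by 100 below:

import evaluate
from datasets import load_dataset
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model = AutoModelForSeq2SeqLM.from_pretrained("ShahzebKhoso/T5-small-xsum")
tokenizer = AutoTokenizer.from_pretrained("ShahzebKhoso/T5-small-xsum")
rouge = evaluate.load("rouge")

# Small slice for illustration; the scores above use the full validation split
data = load_dataset("EdinburghNLP/xsum", split="validation[:32]")

predictions = []
for document in data["document"]:
    inputs = tokenizer("summarize: " + document, return_tensors="pt",
                       truncation=True, max_length=1024)
    ids = model.generate(**inputs, max_length=64, min_length=10,
                         length_penalty=2.0, num_beams=4)
    predictions.append(tokenizer.decode(ids[0], skip_special_tokens=True))

scores = rouge.compute(predictions=predictions, references=data["summary"])
print({k: round(v * 100, 3) for k, v in scores.items()})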

⚠️ Limitations and Bias

  • The model is trained only on English news articles from the XSUM dataset.
  • It may hallucinate facts not present in the source text.
  • Summaries are very short (one sentence), consistent with XSUM style.
