# T5-small fine-tuned on XSUM for Summarization
This model is a T5-small fine-tuned on the Extreme Summarization (XSUM) dataset for abstractive summarization.
It generates a one-sentence summary for a given news article.
## Model Details
- Base model: T5-small
- Task: Abstractive summarization
- Language: English
- Dataset: XSUM
- Fine-tuning: 1 epoch (12,753 steps)
- Max input length: 1024 tokens
- Max target length: 128 tokens (see the preprocessing sketch below)
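
The training script itself is not included in this card, so the following is only a minimal sketch of how articles and reference summaries could be tokenized to match the listed lengths (the `preprocess` function name and the use of the standard XSUM `document`/`summary` columns are assumptions, not the author's exact code):

```python
from transformers import AutoTokenizer

# Base tokenizer used by T5-small
tokenizer = AutoTokenizer.from_pretrained("t5-small")

def preprocess(examples):
    # Prepend the T5 task prefix and truncate articles to 1024 tokens
    inputs = tokenizer(
        ["summarize: " + doc for doc in examples["document"]],
        max_length=1024,
        truncation=True,
    )
    # Truncate reference summaries to 128 tokens and use them as labels
    labels = tokenizer(
        text_target=examples["summary"],
        max_length=128,
        truncation=True,
    )
    inputs["labels"] = labels["input_ids"]
    return inputs
```

With the dataset loaded via `datasets.load_dataset("xsum")`, a function like this could be applied with `dataset.map(preprocess, batched=True)`.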
## How to Use
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load the fine-tuned checkpoint and its tokenizer
model = AutoModelForSeq2SeqLM.from_pretrained("ShahzebKhoso/T5-small-xsum")
tokenizer = AutoTokenizer.from_pretrained("ShahzebKhoso/T5-small-xsum")

# T5 expects a task prefix, so prepend "summarize: " to the article
text = "The Prime Minister held a meeting today with the cabinet..."
inputs = tokenizer("summarize: " + text, return_tensors="pt", truncation=True)

# Beam search with a length penalty yields a concise, fluent one-sentence summary
summary_ids = model.generate(**inputs, max_length=64, min_length=10, length_penalty=2.0, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```
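
For quick experiments, the high-level `pipeline` API can also be used. This is a minimal sketch; depending on how the saved config handles the task prefix, you may or may not need to prepend `summarize: ` yourself:

```python
from transformers import pipeline

# Wrap the same checkpoint in a summarization pipeline
summarizer = pipeline("summarization", model="ShahzebKhoso/T5-small-xsum")

article = "The Prime Minister held a meeting today with the cabinet..."
result = summarizer("summarize: " + article, max_length=64, min_length=10, num_beams=4)
print(result[0]["summary_text"])
```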
## Evaluation Results
Evaluated on the XSUM validation set:
| Metric     | Score  |
|------------|--------|
| ROUGE-1    | 28.591 |
| ROUGE-2    | 7.8217 |
| ROUGE-L    | 22.38  |
| ROUGE-Lsum | 22.382 |
| Gen. Len.  | 19.71  |
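
The evaluation script is not part of this card; scores of this kind are typically computed with the Hugging Face `evaluate` ROUGE metric, roughly as sketched below (the example predictions and references are placeholders, not XSUM data):

```python
import evaluate

rouge = evaluate.load("rouge")

# predictions: decoded model summaries; references: gold XSUM summaries
predictions = ["The cabinet met with the prime minister today."]
references = ["The Prime Minister held a cabinet meeting."]

scores = rouge.compute(predictions=predictions, references=references, use_stemmer=True)
# Scale the 0-1 scores to the 0-100 range shown in the table above
print({k: round(v * 100, 4) for k, v in scores.items()})
```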
## Limitations and Bias
- The model is trained only on English news articles from the XSUM dataset.
- It may hallucinate facts that are not present in the source text.
- Summaries are very short (one sentence), consistent with XSUM style.