metadata
language:
- en
license: mit
tags:
- summarization
- t5-large-summarization
- pipeline:summarization
model-index:
- name: sysresearch101/t5-large-finetuned-xsum
results:
- task:
type: summarization
name: Summarization
dataset:
name: xsum
type: xsum
config: default
split: test
metrics:
- type: rouge
value: 26.8921
name: ROUGE-1
verified: true
verifyToken: >-
eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZmFkMTFiNmM3YmRkZDk1Y2FhM2EwOTdiYmUwYjBhMGEzZmIyZmIwNWI5OTVmY2U0N2QzYzgxYzM0OTEzMjFjNSIsInZlcnNpb24iOjF9.fOq4zI_BWvTLFJFQOWNk3xEsDIu3aAeboGYPw5TiBqdJJjvdyKmLbfj2WVnNboWbrmp1PuL01iJjTi2Xj6PUAA
- type: rouge
value: 6.9411
name: ROUGE-2
verified: true
verifyToken: >-
eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiMTBlZmI3NjQ3M2JiYzI4MTg3YmJkMjg0ZmE5MDUwNzljNTYyM2M0NzA3YTNiNTA2Nzk4MDhhYWZjZjgyMmE1MCIsInZlcnNpb24iOjF9.rH0DY2hMz2rXaK29vkt7xah-3G95rY4MOS2oVKjXmw4TijB-ZVytfLJAlBmyqA8HYAythRCywmLSjjCDWc66Cg
- type: rouge
value: 21.2832
name: ROUGE-L
verified: true
verifyToken: >-
eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiODAwZDYzNTc0NjZhNzNiMDE2ZDY2NjNjNmViNTc0NDVjNTZkYjljODhmYmNiMWFhY2NkZjU5MzQ0NmM0OTcyMSIsInZlcnNpb24iOjF9.5duHtdjZ8dwtbp1HKyMR4mVK9IIlEZvuWGjQMErpE7VNyKPhMOT6Avh_vXFQz6q_jBzqpZGGREho1mt50yBsDw
- type: rouge
value: 21.284
name: ROUGE-LSUM
verified: true
verifyToken: >-
eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiMGQ2NmNhZTZmZDFkNTcyYjQ4MjhhYWJhODY1ZGRjODY2ZTE5MmRmZDRlYTk4NWE4YWM1OWY2M2NjOWQ3YzU0OCIsInZlcnNpb24iOjF9.SJ8xTcAVWrRDmJmQoxE1ADIcdGA4tr3V04Lv0ipMJiUksCdNC7FO8jYbjG9XmiqbDnnr5h4XoK4JB4-GsA-gDA
- type: loss
value: 2.5411810874938965
name: loss
verified: true
verifyToken: >-
eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZGViNTVlNGI0Njk4NmZmZjExNDBkNTQ4N2FhMzRkZjRjNDNlYzFhZDIyMjJhMmFiM2ZhMTQzYTM4YzNkNWVlNyIsInZlcnNpb24iOjF9.p9n2Kf48k9F9Bkk9j7UKRayvVmOr7_LV80T0ti4lUWFtTsZ91Re841xnEAcKSYgQ9-Bni56ldq9js3kunspJCw
- type: gen_len
value: 18.7755
name: gen_len
verified: true
verifyToken: >-
eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZmQ1ZWUxNmFjNmU0OGI4MDQyZDNjMWQwZGViNDhlMzE1OGE3YmYwYzZjYmM1NWEwMjk2MDFiMjQ4ZThhMjg5YyIsInZlcnNpb24iOjF9.aNp-NFzBSm84GnXuDtYuHaOsSk7zw8kjCphowYFciwt-aDnhwwurYIr59kMT8JNFMnRInsDi8tvYdapareV3DA
datasets:
- EdinburghNLP/xsum
base_model:
- google-t5/t5-large
T5-Large Fine-tuned on XSum
Task: Abstractive Summarization (English)
Base Model: google-t5/t5-large
License: MIT
Overview
This model is a T5-Large checkpoint fine-tuned exclusively on the XSum dataset. It specializes in generating concise, single-sentence summaries in the style of BBC article abstracts.
Performance (XSum test set)
| Metric | Score |
|---|---|
| ROUGE-1 | 26.89 |
| ROUGE-2 | 6.94 |
| ROUGE-L | 21.28 |
| Loss | 2.54 |
| Avg. Length | 18.78 tokens |
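To make the ROUGE-1 and ROUGE-2 rows easier to interpret, here is a minimal sketch of an n-gram overlap F1 score. It is a simplification and not the scorer used to produce the numbers above: it assumes lowercase whitespace tokenization with no stemming, and the function names `ngrams` and `rouge_n_f1` are illustrative only.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(reference, candidate, n=1):
    """Simplified ROUGE-N F1: lowercase whitespace tokens, no stemming."""
    ref = ngrams(reference.lower().split(), n)
    cand = ngrams(candidate.lower().split(), n)
    # Counter intersection keeps the minimum count per n-gram (clipped matches)
    overlap = sum((ref & cand).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


reference = "the cat sat on the mat"
candidate = "the cat lay on the mat"
# Scores are reported on a 0-100 scale, as in the table
print(round(100 * rouge_n_f1(reference, candidate, n=1), 2))
print(round(100 * rouge_n_f1(reference, candidate, n=2), 2))
```

For reported results, an established implementation (for example the `rouge_score` package) is the safer choice, since tokenization and stemming choices shift the scores.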
Usage
Quick Start
from transformers import pipeline
summarizer = pipeline("summarization", model="sysresearch101/t5-large-finetuned-xsum")
article = "Your article text here..."
summary = summarizer(article, max_length=80, min_length=20, do_sample=False)
print(summary[0]['summary_text'])
Advanced Usage
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
# Load the fine-tuned tokenizer and model from the Hub
tokenizer = AutoTokenizer.from_pretrained("sysresearch101/t5-large-finetuned-xsum")
model = AutoModelForSeq2SeqLM.from_pretrained("sysresearch101/t5-large-finetuned-xsum")
article = "Your article text here..."
# Prepend the T5 task prefix; inputs longer than 512 tokens are truncated
inputs = tokenizer("summarize: " + article, return_tensors="pt", max_length=512, truncation=True)
# Beam search combined with sampling (do_sample=True), so output can vary between runs
outputs = model.generate(
    **inputs,
    max_length=80,
    min_length=20,
    num_beams=4,
    no_repeat_ngram_size=2,
    length_penalty=1.0,
    repetition_penalty=2.5,
    use_cache=True,
    early_stopping=True,
    do_sample=True,
    temperature=0.8,
    top_k=50,
    top_p=0.95,
)
summary = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(summary)
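The call above combines beam search (`num_beams=4`) with sampling (`do_sample=True`), so repeated runs can produce different summaries. The sketch below shows one way to switch between that setting and deterministic beam search. `build_generation_kwargs` is a hypothetical helper, not part of `transformers`; the parameter values are copied from this card, and the split into two modes is an assumption about how you might want to use them.

```python
def build_generation_kwargs(sample=False):
    """Return keyword arguments for model.generate() (hypothetical helper).

    sample=False -> deterministic beam search
    sample=True  -> beam search with sampling, as in the call above
    """
    kwargs = {
        "max_length": 80,
        "min_length": 20,
        "num_beams": 4,
        "no_repeat_ngram_size": 2,
        "length_penalty": 1.0,
        "repetition_penalty": 2.5,
        "early_stopping": True,
        "do_sample": sample,
    }
    if sample:
        # These parameters only take effect when sampling is enabled
        kwargs.update({"temperature": 0.8, "top_k": 50, "top_p": 0.95})
    return kwargs


deterministic = build_generation_kwargs(sample=False)
sampled = build_generation_kwargs(sample=True)
# Keys that are present only in the sampling configuration
print(sorted(set(sampled) - set(deterministic)))

# With a loaded model and tokenized inputs (see the code above):
# outputs = model.generate(**inputs, **build_generation_kwargs(sample=False))
```

Deterministic beam search is usually the easier mode for evaluation and regression testing, because the same input yields the same summary.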
Training Data
- XSum: BBC articles paired with professionally written single-sentence summaries
Intended Use
- Primary: Single-sentence abstractive summarization of English news articles
- Secondary: Research on extreme summarization and single-sentence summary generation, educational demonstrations, comparative studies with multi-sentence models
- Not recommended: Multi-sentence summarization tasks, production use without validation
Limitations
- Trained only on news domain; may not generalize to other text types
- Generates very short summaries (average ~19 tokens)
- May oversimplify complex topics due to single-sentence constraint
Citation
@misc{stept2023_t5_large_xsum,
author = {Shlomo Stept (sysresearch101)},
title = {T5-Large Fine-tuned on XSum for Abstractive Summarization},
year = {2023},
publisher = {Hugging Face},
url = {https://huggingface.co/sysresearch101/t5-large-finetuned-xsum}
}
Papers Using This Model
- Tam et al. (2023). Evaluating the Factual Consistency of Large Language Models Through Summarization (FIB). Findings of ACL 2023.
- Liu et al. (2024). LLMs as Narcissistic Evaluators: When Ego Inflates Evaluation Scores. Findings of ACL 2024.
- Zhu et al. (2024). MTAS: A Reference-Free Approach for Evaluating Abstractive Summarization Systems. Proceedings of the ACM on Software Engineering (FSE 2024).
Contact
Created by Shlomo Stept (ORCID: 0009-0009-3185-589X), DARMIS AI
- Website: shlomostept.com
- LinkedIn: linkedin.com/in/shlomo-stept