---
language:
- en
license: mit
tags:
- summarization
- t5-large-summarization
- pipeline:summarization
model-index:
- name: sysresearch101/t5-large-finetuned-xsum
  results:
  - task:
      type: summarization
      name: Summarization
    dataset:
      name: xsum
      type: xsum
      config: default
      split: test
    metrics:
    - type: rouge
      value: 26.8921
      name: ROUGE-1
      verified: true
      verifyToken: >-
        eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZmFkMTFiNmM3YmRkZDk1Y2FhM2EwOTdiYmUwYjBhMGEzZmIyZmIwNWI5OTVmY2U0N2QzYzgxYzM0OTEzMjFjNSIsInZlcnNpb24iOjF9.fOq4zI_BWvTLFJFQOWNk3xEsDIu3aAeboGYPw5TiBqdJJjvdyKmLbfj2WVnNboWbrmp1PuL01iJjTi2Xj6PUAA
    - type: rouge
      value: 6.9411
      name: ROUGE-2
      verified: true
      verifyToken: >-
        eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiMTBlZmI3NjQ3M2JiYzI4MTg3YmJkMjg0ZmE5MDUwNzljNTYyM2M0NzA3YTNiNTA2Nzk4MDhhYWZjZjgyMmE1MCIsInZlcnNpb24iOjF9.rH0DY2hMz2rXaK29vkt7xah-3G95rY4MOS2oVKjXmw4TijB-ZVytfLJAlBmyqA8HYAythRCywmLSjjCDWc66Cg
    - type: rouge
      value: 21.2832
      name: ROUGE-L
      verified: true
      verifyToken: >-
        eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiODAwZDYzNTc0NjZhNzNiMDE2ZDY2NjNjNmViNTc0NDVjNTZkYjljODhmYmNiMWFhY2NkZjU5MzQ0NmM0OTcyMSIsInZlcnNpb24iOjF9.5duHtdjZ8dwtbp1HKyMR4mVK9IIlEZvuWGjQMErpE7VNyKPhMOT6Avh_vXFQz6q_jBzqpZGGREho1mt50yBsDw
    - type: rouge
      value: 21.284
      name: ROUGE-LSUM
      verified: true
      verifyToken: >-
        eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiMGQ2NmNhZTZmZDFkNTcyYjQ4MjhhYWJhODY1ZGRjODY2ZTE5MmRmZDRlYTk4NWE4YWM1OWY2M2NjOWQ3YzU0OCIsInZlcnNpb24iOjF9.SJ8xTcAVWrRDmJmQoxE1ADIcdGA4tr3V04Lv0ipMJiUksCdNC7FO8jYbjG9XmiqbDnnr5h4XoK4JB4-GsA-gDA
    - type: loss
      value: 2.5411810874938965
      name: loss
      verified: true
      verifyToken: >-
        eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZGViNTVlNGI0Njk4NmZmZjExNDBkNTQ4N2FhMzRkZjRjNDNlYzFhZDIyMjJhMmFiM2ZhMTQzYTM4YzNkNWVlNyIsInZlcnNpb24iOjF9.p9n2Kf48k9F9Bkk9j7UKRayvVmOr7_LV80T0ti4lUWFtTsZ91Re841xnEAcKSYgQ9-Bni56ldq9js3kunspJCw
    - type: gen_len
      value: 18.7755
      name: gen_len
      verified: true
      verifyToken: >-
        eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZmQ1ZWUxNmFjNmU0OGI4MDQyZDNjMWQwZGViNDhlMzE1OGE3YmYwYzZjYmM1NWEwMjk2MDFiMjQ4ZThhMjg5YyIsInZlcnNpb24iOjF9.aNp-NFzBSm84GnXuDtYuHaOsSk7zw8kjCphowYFciwt-aDnhwwurYIr59kMT8JNFMnRInsDi8tvYdapareV3DA
datasets:
- EdinburghNLP/xsum
base_model:
- google-t5/t5-large
---

# T5-Large Fine-tuned on XSum

**Task:** Abstractive Summarization (English)
**Base Model:** google-t5/t5-large
**License:** MIT

## Overview

This model is a T5-Large checkpoint fine-tuned exclusively on the [XSum](https://huggingface.co/datasets/EdinburghNLP/xsum) dataset. It specializes in generating concise, single-sentence summaries in the style of BBC article abstracts.

## Performance

Scores on the XSum test set:

| Metric | Score |
|--------|-------|
| ROUGE-1 | 26.89 |
| ROUGE-2 | 6.94 |
| ROUGE-L | 21.28 |
| Loss | 2.54 |
| Avg. Length | 18.77 tokens |
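These scores come from Hugging Face's automatic model evaluation (hence the `verified: true` entries in the metadata above). As a rough local sanity check, here is a minimal sketch that recomputes ROUGE with the `evaluate` and `datasets` libraries; the sample size and generation settings below are illustrative assumptions, so the numbers will not match the verified full-test-set scores exactly:

```python
# Illustrative sanity check: recompute ROUGE on a small XSum test sample.
# The sample size and generation settings are assumptions, not the exact
# configuration behind the verified scores reported above.
import evaluate
from datasets import load_dataset
from transformers import pipeline

summarizer = pipeline("summarization", model="sysresearch101/t5-large-finetuned-xsum")
rouge = evaluate.load("rouge")

# A small slice keeps the example fast; depending on your `datasets` version,
# XSum's loading script may require trust_remote_code=True.
sample = load_dataset("EdinburghNLP/xsum", split="test[:8]")

outputs = summarizer(
    sample["document"],
    max_length=80,
    min_length=20,
    truncation=True,
    do_sample=False,
)
predictions = [o["summary_text"] for o in outputs]

# Aggregate ROUGE over the sample, scaled to the usual 0-100 range.
scores = rouge.compute(predictions=predictions, references=sample["summary"])
print({name: round(value * 100, 2) for name, value in scores.items()})
```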
## Usage

### Quick Start

```python
from transformers import pipeline

summarizer = pipeline("summarization", model="sysresearch101/t5-large-finetuned-xsum")

article = "Your article text here..."
summary = summarizer(article, max_length=80, min_length=20, do_sample=False)
print(summary[0]['summary_text'])
```

### Advanced Usage

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("sysresearch101/t5-large-finetuned-xsum")
model = AutoModelForSeq2SeqLM.from_pretrained("sysresearch101/t5-large-finetuned-xsum")

# T5 expects the task prefix "summarize: " before the input text.
inputs = tokenizer("summarize: " + article, return_tensors="pt", max_length=512, truncation=True)

outputs = model.generate(
    **inputs,
    max_length=80,
    min_length=20,
    num_beams=4,
    no_repeat_ngram_size=2,
    length_penalty=1.0,
    repetition_penalty=2.5,
    use_cache=True,
    early_stopping=True,
    do_sample=True,  # combined with num_beams=4, this performs beam-search multinomial sampling
    temperature=0.8,
    top_k=50,
    top_p=0.95,
)
summary = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(summary)
```

## Training Data

- [XSum](https://huggingface.co/datasets/EdinburghNLP/xsum): BBC articles paired with professionally written single-sentence summaries

## Intended Use

- **Primary:** Summarization
- **Secondary:** Research on extreme summarization, single-sentence summary generation, educational demonstrations, comparative studies with multi-sentence models
- **Not recommended:** Multi-sentence summarization tasks, production use without validation

## Limitations

- Trained only on the news domain; may not generalize to other text types
- Generates very short summaries (average ~19 tokens)
- May oversimplify complex topics due to the single-sentence constraint

## Citation

```bibtex
@misc{stept2023_t5_large_xsum,
  author = {Shlomo Stept (sysresearch101)},
  title = {T5-Large Fine-tuned on XSum for Abstractive Summarization},
  year = {2023},
  publisher = {Hugging Face},
  url = {https://huggingface.co/sysresearch101/t5-large-finetuned-xsum}
}
```

## Papers Using This Model

* [Tam et al. (2023). *Evaluating the Factual Consistency of Large Language Models Through News Summarization (FIB).* Findings of ACL 2023.](https://arxiv.org/pdf/2211.08412)
* [Liu et al. (2024). *LLMs as Narcissistic Evaluators: When Ego Inflates Evaluation Scores.* Findings of ACL 2024.](https://aclanthology.org/2024.findings-acl.753.pdf)
* [Zhu et al. (2024). *MTAS: A Reference-Free Approach for Evaluating Abstractive Summarization Systems.* Proceedings of the ACM on Software Engineering (FSE 2024).](https://doi.org/10.1145/3660820)

## Contact

Created by [Shlomo Stept](https://shlomostept.com) ([ORCID: 0009-0009-3185-589X](https://orcid.org/0009-0009-3185-589X)), DARMIS AI

- Website: [shlomostept.com](https://shlomostept.com)
- LinkedIn: [linkedin.com/in/shlomo-stept](https://linkedin.com/in/shlomo-stept)