News Summarizer
This model is fine-tuned for news article summarization. It can take long news articles and generate concise, accurate summaries.
Model Details
- Base Model: facebook/bart-large-cnn
- Task: Text Summarization
- Language: English
- Training Steps: 4000
- Best ROUGE-1: 0.42
- Live version on Streamlit: https://english-news-summarizer.streamlit.app
Usage
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
import re

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
model = AutoModelForSeq2SeqLM.from_pretrained("ciorant/news-summarizer")
tokenizer = AutoTokenizer.from_pretrained("ciorant/news-summarizer")

def summarize_news(article_text, max_length=128):
    # Tokenize the article; anything beyond 512 tokens is truncated
    inputs = tokenizer(article_text, return_tensors="pt", truncation=True, max_length=512)
    # Deterministic beam search (no sampling)
    outputs = model.generate(
        inputs.input_ids,
        attention_mask=inputs.attention_mask,
        max_length=max_length,
        num_beams=4,
        early_stopping=True,
        do_sample=False,
        length_penalty=1.0
    )
    summary = tokenizer.decode(outputs[0], skip_special_tokens=True)
    # Clean up spacing around punctuation
    summary = re.sub(r'\s+([.,!?;:])', r'\1', summary)
    # Collapse repeated whitespace into single spaces
    summary = re.sub(r'\s+', ' ', summary)
    return summary.strip()

# Example usage
article = "Your news article text here..."
summary = summarize_news(article)
print(summary)
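The cleanup step at the end of `summarize_news` can be tried on its own, without downloading the model. The sketch below pulls the same two regular expressions into a helper; the name `clean_summary` and the sample string are illustrative assumptions, not part of the published code.

```python
import re

def clean_summary(text: str) -> str:
    """Tidy spacing artifacts in decoded text (hypothetical helper)."""
    # Drop whitespace that precedes punctuation: "rates ." -> "rates."
    text = re.sub(r'\s+([.,!?;:])', r'\1', text)
    # Collapse any run of whitespace (spaces, tabs, newlines) into one space
    text = re.sub(r'\s+', ' ', text)
    return text.strip()

# Made-up decoder output with stray spaces and a newline
raw = "  The central bank raised rates ,  citing inflation .\nMarkets fell . "
print(clean_summary(raw))
# The central bank raised rates, citing inflation. Markets fell.
```

Note that the first pattern removes whitespace before every `.,!?;:` character, so it would also tighten text where the space was intentional.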
Training Data
The model was fine-tuned on English news articles for the summarization task.
Performance
- ROUGE-1: ~0.42
- ROUGE-2: ~0.21
- ROUGE-L: ~0.29
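To make the three metrics above concrete, here is a minimal pure-Python sketch of ROUGE-N and ROUGE-L F1 scores. It is not the evaluation script used for this model: it assumes lowercased whitespace tokenization with no stemming and a single reference, so its numbers will not match a standard ROUGE package exactly. All function names are illustrative.

```python
from collections import Counter

def _ngrams(tokens, n):
    # Multiset of all contiguous n-grams in the token list
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def _f1(overlap, cand_total, ref_total):
    # Harmonic mean of precision and recall; 0.0 when nothing matches
    if overlap == 0 or cand_total == 0 or ref_total == 0:
        return 0.0
    precision = overlap / cand_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)

def rouge_n(candidate, reference, n=1):
    cand = _ngrams(candidate.lower().split(), n)
    ref = _ngrams(reference.lower().split(), n)
    overlap = sum((cand & ref).values())  # clipped n-gram matches
    return _f1(overlap, sum(cand.values()), sum(ref.values()))

def _lcs_length(a, b):
    # Longest common subsequence length via row-by-row dynamic programming
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]

def rouge_l(candidate, reference):
    cand = candidate.lower().split()
    ref = reference.lower().split()
    return _f1(_lcs_length(cand, ref), len(cand), len(ref))

candidate = "the cat sat on the mat"
reference = "the cat lay on the mat"
print(round(rouge_n(candidate, reference, 1), 3),
      round(rouge_n(candidate, reference, 2), 3),
      round(rouge_l(candidate, reference), 3))
# 0.833 0.6 0.833
```

ROUGE-1 and ROUGE-2 count overlapping unigrams and bigrams, while ROUGE-L rewards the longest in-order word sequence shared with the reference, which is why ROUGE-2 is typically the lowest of the three.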
Limitations
- Optimized for English news articles
- Performs best on articles of 100 to 800 words; the usage example truncates input at 512 tokens, so later text in longer articles is ignored
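One way to act on the length guidance is to check an article's word count before summarizing it. The helper below is a hypothetical sketch: the name `article_length_report` is an assumption, the 100 and 800 defaults come from the limitation stated above, and words are counted by simple whitespace splitting.

```python
def article_length_report(article_text, min_words=100, max_words=800):
    """Report the word count and whether it falls in the recommended range."""
    # Whitespace splitting is a rough word count; it is not the model's tokenizer
    word_count = len(article_text.split())
    return {
        "word_count": word_count,
        "in_recommended_range": min_words <= word_count <= max_words,
    }

report = article_length_report("Your news article text here...")
if not report["in_recommended_range"]:
    print(f"Article has {report['word_count']} words; summary quality may suffer.")
```

Word counts and token counts differ, so an article inside the word range can still exceed the tokenizer's limit.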