bigbird-pegasus on the BookSum dataset
This is the "latest" version of the model, i.e., the one trained the longest, currently at ~70k steps.
- GOAL: a summarization model that 1) summarizes the source content accurately, and 2) more importantly (IMO), produces summaries that are easy to read and understand (*cough* unlike arXiv *cough*)
- This model attempts to help with that by using the booksum dataset to provide explanatory summarization
- Explanatory summary: a summary that both consolidates information and explains why that information is important.
- This model was trained for seven epochs in total (approx. 70,000 steps) and is closer to being finished.
- The model will continue to improve (slowly, now that it has already been trained for a long time) based on findings and feedback.
- the starting checkpoint was `google/bigbird-pegasus-large-bigpatent`
example usage
An extended example, including a demo of batch summarization, is here.
- create the summarizer object:
```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, pipeline

# load the fine-tuned model and its tokenizer
model = AutoModelForSeq2SeqLM.from_pretrained(
    "pszemraj/bigbird-pegasus-large-K-booksum",
    low_cpu_mem_usage=True,
)
tokenizer = AutoTokenizer.from_pretrained(
    "pszemraj/bigbird-pegasus-large-K-booksum",
)

# wrap model + tokenizer in a summarization pipeline
summarizer = pipeline(
    "summarization",
    model=model,
    tokenizer=tokenizer,
)
```
- define the text to be summarized and pass it through the pipeline. Boom, done.
```python
wall_of_text = "your text to be summarized goes here."

# generate a summary (min_length/max_length are in tokens)
result = summarizer(
    wall_of_text,
    min_length=16,
    max_length=256,
    no_repeat_ngram_size=3,
    clean_up_tokenization_spaces=True,
)
print(result[0]["summary_text"])
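```

Since the card mentions a demo of batch summarization, here is a minimal sketch of passing several documents through the same pipeline at once; the example texts and the `batch_size` value are illustrative, not taken from the linked demo:

```python
# batch summarization sketch: the pipeline accepts a list of strings
# and (with the default single return sequence) yields one
# {"summary_text": ...} dict per input document.
documents = [
    "first long document to be summarized goes here.",
    "second long document to be summarized goes here.",
]

results = summarizer(
    documents,
    min_length=16,
    max_length=256,
    no_repeat_ngram_size=3,
    clean_up_tokenization_spaces=True,
    batch_size=2,  # illustrative; tune to fit your GPU/CPU memory
)

for res in results:
    print(res["summary_text"])
```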
Alternate Checkpoint
- if you run into runtime/memory issues, try this earlier checkpoint at 40,000 steps, which is almost as good at the explanatory summarization task but runs faster (see the loading sketch after this list).
- see similar summarization models fine-tuned on booksum but using different architectures: long-t5 base and LED-Large
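
Switching to the earlier checkpoint only changes the model id passed to `pipeline`; the sketch below uses a placeholder id, since the 40k-step checkpoint is only linked (not named) above:

```python
from transformers import pipeline

# "EARLIER_CHECKPOINT_ID" is a placeholder: substitute the repo id of the
# linked 40,000-step checkpoint before running.
alt_summarizer = pipeline(
    "summarization",
    model="EARLIER_CHECKPOINT_ID",
)

result = alt_summarizer(
    wall_of_text,
    min_length=16,
    max_length=256,
    no_repeat_ngram_size=3,
)
print(result[0]["summary_text"])
```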
Training dataset: kmfoda/booksum
Evaluation results
- ROUGE-1 on kmfoda/booksum test set (verified): 34.076
- ROUGE-2 on kmfoda/booksum test set (verified): 5.918
- ROUGE-L on kmfoda/booksum test set (verified): 16.387
- ROUGE-LSUM on kmfoda/booksum test set (verified): 31.612
- loss on kmfoda/booksum test set (verified): 3.522
- gen_len on kmfoda/booksum test set (verified): 254.368
- ROUGE-1 on launch/gov_report test set (verified): 40.015
- ROUGE-2 on launch/gov_report test set (verified): 10.741
- ROUGE-L on launch/gov_report test set (verified): 20.134
- ROUGE-LSUM on launch/gov_report test set (verified): 36.774