Update README.md
README.md (CHANGED)
@@ -171,9 +171,34 @@ long_text = "Here is a lot of text I don't want to read. Replace me"
```python
long_text = "Here is a lot of text I don't want to read. Replace me"
result = summarizer(long_text)
print(result[0]["summary_text"])
```
### beyond the basics

### decoding performance
Pass [other parameters related to beam-search text generation](https://huggingface.co/blog/how-to-generate) when calling `summarizer` to get even higher-quality results.
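
For example, here is a minimal sketch of passing beam-search parameters through the pipeline call (the specific values are illustrative defaults to experiment with, not tuned recommendations for this model):

```python
result = summarizer(
    long_text,
    num_beams=4,             # beam search instead of greedy decoding
    no_repeat_ngram_size=3,  # discourage repeated 3-grams in the summary
    early_stopping=True,     # end beam search once enough complete candidates exist
    max_length=512,          # upper bound on generated tokens
    min_length=8,            # lower bound on generated tokens
)
print(result[0]["summary_text"])
```

See the linked blog post for what each parameter does and how it affects output quality.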
### LLM.int8 Quantization
Per a recent PR, LLM.int8 is now supported for `long-t5` models. In **initial testing**, summarization quality appears to hold while requiring _significantly_ less memory! \*

How-to: ensure you have pip-installed the **latest GitHub `main`** version of `transformers` (e.g., `pip install -U git+https://github.com/huggingface/transformers.git`), and:
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("pszemraj/long-t5-tglobal-xl-16384-book-summary")

# load the weights in 8-bit with LLM.int8 (requires the bitsandbytes and accelerate packages)
model = AutoModelForSeq2SeqLM.from_pretrained(
    "pszemraj/long-t5-tglobal-xl-16384-book-summary",
    load_in_8bit=True,
    device_map="auto",
)
```
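
As a quick usage sketch (reusing `long_text` from earlier; the generation parameters are illustrative, and a CUDA GPU is assumed for the 8-bit weights):

```python
# encode the input, generate a summary with the 8-bit model, and decode it
inputs = tokenizer(long_text, return_tensors="pt").to(model.device)
summary_ids = model.generate(**inputs, max_new_tokens=512, num_beams=4)
print(tokenizer.batch_decode(summary_ids, skip_special_tokens=True)[0])
```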
Do you love to ask questions? Awesome. But first, check out the [how LLM.int8 works blog post](https://huggingface.co/blog/hf-bitsandbytes-integration) by Hugging Face.
\* A more rigorous, metric-based comparison of beam-search summarization with and without LLM.int8 will take place over time.

---
203 |
|
204 |
## About