BLOOM LM - 8bit
BigScience Large Open-science Open-access Multilingual Language Model - 8bit
Model Card
Version 1.0 / 26.May.2022
Related paper: https://arxiv.org/abs/2208.07339
TL;DR
This repository contains the 8bit weights of the bloom-1b7 model. You can load this model out of the box using transformers==4.28.0 and bitsandbytes>0.37.2.
# pip install accelerate bitsandbytes
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("ybelkada/bloom-1b7-8bit")
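Because loading depends on the versions stated above, it can help to check the installed packages first. The following is a minimal sketch, not part of the original card: the helper names `parse_version` and `versions_ok` are hypothetical, and the parsing is deliberately simple (it keeps only the leading numeric components, so a suffix such as `.dev0` or `.post2` is ignored).

```python
import re
from importlib import metadata


def parse_version(text: str) -> tuple:
    """Turn a version string such as '4.28.0' into (4, 28, 0).

    Parsing stops at the first component that does not start with a digit,
    so '0.41.2.post2' becomes (0, 41, 2).
    """
    numbers = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        numbers.append(int(match.group()))
    return tuple(numbers)


def versions_ok(transformers_version: str, bitsandbytes_version: str) -> bool:
    """Check the constraints from this card: transformers==4.28.0, bitsandbytes>0.37.2."""
    return (
        parse_version(transformers_version) == (4, 28, 0)
        and parse_version(bitsandbytes_version) > (0, 37, 2)
    )


if __name__ == "__main__":
    try:
        ok = versions_ok(
            metadata.version("transformers"),
            metadata.version("bitsandbytes"),
        )
        print("versions match the card" if ok else "versions differ from the card")
    except metadata.PackageNotFoundError as err:
        print(f"missing package: {err}")
```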
How to push 8bit weights?
First, make sure you are using the transformers and bitsandbytes versions stated above. Then load your 8bit model as usual using load_in_8bit=True.
# pip install accelerate bitsandbytes
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("bigscience/bloom-1b7", device_map="auto", load_in_8bit=True)
Then just call the push_to_hub method, or the save_pretrained method if you want to save your 8bit model locally.
model.push_to_hub("{your_username}/bloom-1b7-8bit")
That's it!
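For the local route, a sketch of the save_pretrained variant might look like the following. It assumes the same transformers and bitsandbytes versions as above and a CUDA-capable GPU; the function name `save_8bit_locally` and the output directory `bloom-1b7-8bit` are hypothetical choices for illustration.

```python
from pathlib import Path


def save_8bit_locally(
    model_id: str = "bigscience/bloom-1b7",
    out_dir: str = "bloom-1b7-8bit",  # hypothetical local directory
) -> Path:
    """Load a model in 8bit and write it to a local directory."""
    # Imported lazily so that defining this function does not require
    # transformers, accelerate, or bitsandbytes to be installed.
    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained(
        model_id, device_map="auto", load_in_8bit=True
    )
    target = Path(out_dir)
    # Same idea as push_to_hub, but the files stay on disk.
    model.save_pretrained(target)
    return target
```

Calling `save_8bit_locally()` downloads the original checkpoint, so it needs network access the first time it runs.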
What is inside the model's state_dict?
Inside the state dict of the model (pytorch_model.bin file) you have
- the quantized int8 weights
- the quantization statistics in float16
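One way to confirm this for yourself is to count the entries of the state dict by dtype. The helper below is a small sketch rather than anything shipped with the model: `summarize_dtypes` is a hypothetical name, and it only assumes that every value in the mapping exposes a `dtype` attribute, as PyTorch tensors do.

```python
from collections import Counter
from typing import Any, Mapping


def summarize_dtypes(state_dict: Mapping[str, Any]) -> Counter:
    """Count how many entries of a state dict use each dtype."""
    return Counter(str(value.dtype) for value in state_dict.values())


# Hypothetical usage, assuming PyTorch is installed and pytorch_model.bin
# has been downloaded into the working directory:
#
#   import torch
#   state_dict = torch.load("pytorch_model.bin", map_location="cpu")
#   print(summarize_dtypes(state_dict))
```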