---
library_name: transformers
license: gemma
base_model:
- google/gemma-3-1b-it
---

## Overview

This is [google/gemma-3-1b-it](https://huggingface.co/google/gemma-3-1b-it) quantized to 4-bit with BitsAndBytes (0.44.1).

The code used for quantization is as follows:

~~~~python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

model_id = "google/gemma-3-1b-it"
repo_id = "indiebot-community/gemma-3-1b-it-bnb-4bit"

# 4-bit NF4 quantization with bfloat16 compute dtype
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

# Push the quantized model and tokenizer to the Hub
tokenizer.push_to_hub(repo_id)
model.push_to_hub(repo_id)
~~~~
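
## Usage

A minimal sketch of loading the quantized model from the Hub and running inference; the prompt and generation settings below are illustrative and not part of the original quantization code:

~~~~python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

repo_id = "indiebot-community/gemma-3-1b-it-bnb-4bit"

# The quantization config is stored with the model, so it loads in 4-bit automatically
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, device_map="auto")

# Example prompt (illustrative only)
messages = [
    {"role": "user", "content": "Hello, who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

with torch.inference_mode():
    outputs = model.generate(inputs, max_new_tokens=128)

# Decode only the newly generated tokens
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
~~~~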