File size: 812 Bytes
d3fd215
 
6119598
 
 
d3fd215
 
6119598
d3fd215
8e885ef
d3fd215
6119598
d3fd215
6119598
 
 
d3fd215
6119598
 
d3fd215
6119598
 
 
 
 
d3fd215
6119598
 
d3fd215
6119598
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
---
library_name: transformers
license: gemma
base_model:
- google/gemma-3-1b-it
---

## Overview

[google/gemma-3-1b-it](https://huggingface.co/google/gemma-3-1b-it)をBitsAndBytes(0.44.1)で4bit量子化

量子化の際のコードは以下の通りです。

~~~~python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

model_id = "google/gemma-3-1b-it" 
repo_id = "indiebot-community/gemma-3-1b-it-bnb-4bit"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config, device_map="auto")

tokenizer.push_to_hub(repo_id)
model.push_to_hub(repo_id)
~~~~