---
library_name: transformers
license: gemma
base_model:
- google/gemma-3-1b-it
---

## Overview

This is [google/gemma-3-1b-it](https://huggingface.co/google/gemma-3-1b-it) quantized to 4-bit with BitsAndBytes (0.44.1).

The code used for quantization is as follows:

~~~~python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

model_id = "google/gemma-3-1b-it"
repo_id = "indiebot-community/gemma-3-1b-it-bnb-4bit"

# 4-bit NF4 quantization with bfloat16 compute dtype
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

# Push the quantized model and tokenizer to the Hub
tokenizer.push_to_hub(repo_id)
model.push_to_hub(repo_id)
~~~~
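
## Usage

A minimal sketch of loading the quantized model from the Hub and running inference; the prompt and generation settings below are illustrative and not part of the original quantization code:

~~~~python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

repo_id = "indiebot-community/gemma-3-1b-it-bnb-4bit"

# The quantization config is stored with the model, so it loads in 4-bit automatically
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, device_map="auto")

# Example prompt (illustrative only)
messages = [
    {"role": "user", "content": "Hello, who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

with torch.inference_mode():
    outputs = model.generate(inputs, max_new_tokens=128)

# Decode only the newly generated tokens
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
~~~~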