abhishekchohan commited on
Commit
56eadf5
Β·
verified Β·
1 Parent(s): 4171c69

Create README.md

Browse files
Files changed (1) hide show
  1. README.md +59 -0
README.md ADDED
@@ -0,0 +1,59 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ ---
2
+ license: gemma
3
+ library_name: transformers
4
+ pipeline_tag: image-text-to-text
5
+ extra_gated_heading: Access Gemma on Hugging Face
6
+ extra_gated_prompt: To access Gemma on Hugging Face, you're required to review and agree to Google's usage license. To do this, please ensure you're logged in to Hugging Face and click below. Requests are processed immediately.
7
+ extra_gated_button_content: Acknowledge license
8
+ base_model: google/gemma-3-27b-it
9
+ ---
10
+ # Gemma 3 Quantized Models
11
+
12
+ This repository contains W4A16 quantized versions of Google's Gemma 3 instruction-tuned models, making them more accessible for deployment on consumer hardware while maintaining good performance.
13
+
14
+ ## Models
15
+
16
+ - **abhishekchohan/gemma-3-27b-it-quantized-W4A16**
17
+ - **abhishekchohan/gemma-3-12b-it-quantized-W4A16**
18
+ - **abhishekchohan/gemma-3-4b-it-quantized-W4A16**
19
+
20
+ ## Repository Structure
21
+
22
+ ```
23
+ gemma-3-{size}-it-quantized-W4A16/
24
+ β”œβ”€β”€ README.md
25
+ β”œβ”€β”€ templates/
26
+ β”‚ └── chat_template.jinja
27
+ β”œβ”€β”€ tools/
28
+ β”‚ └── tool_parser.py
29
+ └── [model files]
30
+ ```
31
+
32
+ ## Quantization Details
33
+
34
+ These models use W4A16 quantization via LLM Compressor:
35
+ - Weights quantized to 4-bit precision
36
+ - Activations use 16-bit precision
37
+ - Significantly reduced memory requirements
38
+
39
+ ## Usage with vLLM
40
+
41
+ ```bash
42
+ vllm serve abhishekchohan/gemma-3-{size}-it-quantized-W4A16 --chat-template templates/chat_template.jinja --enable-auto-tool-choice --tool-call-parser gemma --tool-parser-plugin tools/tool_parser.py
43
+ ```
44
+
45
+ ## License
46
+
47
+ These models are subject to the Gemma license. Users must acknowledge and accept the license terms before using the models.
48
+
49
+ ## Citation
50
+
51
+ ```
52
+ @article{gemma_2025,
53
+ title={Gemma 3},
54
+ url={https://goo.gle/Gemma3Report},
55
+ publisher={Kaggle},
56
+ author={Gemma Team},
57
+ year={2025}
58
+ }
59
+ ```