---

This is an HQQ-quantized version (4-bit, group-size=64) of the <a href="https://huggingface.co/google/gemma-3-12b-it">gemma-3-12b-it</a> model.
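To illustrate what "4-bit, group-size=64" means, here is a minimal NumPy sketch of asymmetric group-wise 4-bit quantization. This is a simplification with min/max calibration only, not the actual HQQ algorithm (which additionally optimizes the scale and zero-point to minimize reconstruction error); the function names are illustrative.

```python
import numpy as np

def quantize_4bit(w, group_size=64):
    """Quantize a flat float array into 4-bit integers, one scale/offset per group."""
    groups = w.reshape(-1, group_size)
    w_min = groups.min(axis=1, keepdims=True)
    w_max = groups.max(axis=1, keepdims=True)
    scale = (w_max - w_min) / 15.0  # 4 bits -> 16 levels (0..15)
    q = np.round((groups - w_min) / scale).clip(0, 15).astype(np.uint8)
    return q, scale, w_min

def dequantize_4bit(q, scale, w_min):
    """Map the 4-bit codes back to approximate float weights."""
    return q * scale + w_min

rng = np.random.default_rng(0)
w = rng.standard_normal(1024).astype(np.float32)
q, scale, zero = quantize_4bit(w)  # 1024 weights -> 16 groups of 64
w_hat = dequantize_4bit(q, scale, zero).ravel()
max_err = np.abs(w_hat - w).max()  # worst-case error is bounded by scale/2 per group
```

Each group of 64 weights stores only 4-bit codes plus one scale and one zero-point, which is where the memory savings over bfloat16 come from.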
## Performance

| Models              | <a href="https://huggingface.co/google/gemma-3-12b-it">bfp16</a> | <a href="https://huggingface.co/mobiuslabsgmbh/gemma-3-12b-it_4bitgs64_bfp16_hqq_hf">HQQ 4-bit gs-64</a> | <a href="https://huggingface.co/gaunernst/gemma-3-12b-it-int4-awq">QAT 4-bit gs-32</a> |
|:-------------------:|:--------:|:--------:|:--------:|
| ARC (25-shot)       | 0.724    | 0.701    | 0.690    |
| HellaSwag (10-shot) | 0.839    | 0.826    | 0.792    |
| MMLU (5-shot)       | 0.730    | 0.724    | 0.693    |
| TruthfulQA-MC2      | 0.580    | 0.585    | 0.550    |
| Winogrande (5-shot) | 0.766    | 0.774    | 0.755    |
| GSM8K (5-shot)      | 0.874    | 0.862    | 0.808    |
| Average             | 0.752    | 0.745    | 0.715    |
## Usage
```Python
# Use transformers up to commit 52cc204dd7fbd671452448028aae6262cea74dc2.
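# A hedged loading sketch (not from the original README): it assumes this
# checkpoint loads through the standard transformers from_pretrained API;
# the prompt and generation settings below are illustrative only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mobiuslabsgmbh/gemma-3-12b-it_4bitgs64_bfp16_hqq_hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

prompt = "Explain HQQ quantization in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))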