steampunque/gemma-3-27b-it-Hybrid-GGUF

4-bit precision

steampunque committed 15 days ago

Commit 3acc215 · verified · 1 Parent(s): 196d921

Update README.md

Files changed (1)

README.md +21 -0

@@ -70,6 +70,27 @@ There is a problem when using q8_0 KV cache format where some heavy computations
 gen become unusably slow. This does not happen with f16 KV, so it is recommended to stay with f16 KV until (or if) this problem is resolved.
 Related discussion in https://github.com/ggml-org/llama.cpp/issues/13747.
+Benchmarks:
+A full set of benchmarks for the model will eventually be given here: https://huggingface.co/spaces/steampunque/benchlm
+gemma-3-27b-it compares most closely with Mistral-Small-3.1-24B-Instruct-2503, available here: https://huggingface.co/steampunque/Mistral-Small-3.1-24B-Instruct-2503-Hybrid-GGUF
+A short summary of some key evals comparing the two models is given here for convenience:
+| model      | gemma-3-27b-it | Mistral-Small-3.1-24B-Instruct-2503 |
+|------------|----------------|-------------------------------------|
+| quant      | Q4_K_H         | Q4_K_H                              |
+| alignment  | strict         | permissive                          |
+| TEST       |                |                                     |
+| Winogrande | 0.748          | 0.784                               |
+| Lambada    | 0.742          | 0.798                               |
+| Hellaswag  | 0.802          | 0.899                               |
+| BoolQ      | 0.701          | 0.646                               |
+| Jeopardy   | 0.830          | 0.740                               |
+| GSM8K      | 0.964          | 0.940                               |
+| Apple      | 0.850          | 0.820                               |
+| Humaneval  | 0.890          | 0.853                               |
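
For readers who want a quick tally of the comparison above, the following is a minimal sketch in Python. The scores are copied from the table; the `summarize` helper and the unweighted mean across tests are illustrative assumptions, not the author's evaluation method, and an average over such different tests should be read loosely.

```python
# Scores copied from the table above, as (gemma-3-27b-it, Mistral-Small-3.1-24B-Instruct-2503).
SCORES = {
    "Winogrande": (0.748, 0.784),
    "Lambada": (0.742, 0.798),
    "Hellaswag": (0.802, 0.899),
    "BoolQ": (0.701, 0.646),
    "Jeopardy": (0.830, 0.740),
    "GSM8K": (0.964, 0.940),
    "Apple": (0.850, 0.820),
    "Humaneval": (0.890, 0.853),
}


def summarize(scores):
    """Hypothetical helper: list the tests each model leads on and the unweighted mean score."""
    gemma_wins = [test for test, (g, m) in scores.items() if g > m]
    mistral_wins = [test for test, (g, m) in scores.items() if m > g]
    gemma_mean = sum(g for g, _ in scores.values()) / len(scores)
    mistral_mean = sum(m for _, m in scores.values()) / len(scores)
    return gemma_wins, mistral_wins, gemma_mean, mistral_mean


if __name__ == "__main__":
    gemma_wins, mistral_wins, gemma_mean, mistral_mean = summarize(SCORES)
    print("gemma-3-27b-it leads on:", ", ".join(gemma_wins))
    print("Mistral-Small-3.1-24B-Instruct-2503 leads on:", ", ".join(mistral_wins))
    print(f"unweighted means: gemma {gemma_mean:.3f}, Mistral {mistral_mean:.3f}")
```

On these numbers gemma-3-27b-it leads on the knowledge, math, and code tests (BoolQ, Jeopardy, GSM8K, Apple, Humaneval), while Mistral-Small-3.1-24B-Instruct-2503 leads on the three commonsense and language-modeling tests (Winogrande, Lambada, Hellaswag).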
 ## Download the file from below:
 | Link | Type | Size/e9 B | Notes |
 |------|------|-----------|-------|