steampunque/Mistral-Small-3.1-24B-Instruct-2503-Hybrid-GGUF

4-bit precision

steampunque committed on Jun 5

Commit a028397 · verified · Parent: 5ce9c73

Update README.md

Files changed (1)

README.md +21 -0

@@ -77,6 +77,27 @@ readme in the tools directory of the source tree https://github.com/ggml-org/lla
 Use of the best available model (Q4_K_H) is recommended to maximize the accuracy of vision mode.  To run it on a GPU with 12 GB
 of VRAM, use -ngl 32.  Generation speed is still quite good with partial offload.
+Benchmarks:
+A full set of benchmarks for the model will eventually be given here: https://huggingface.co/spaces/steampunque/benchlm
+Mistral-Small-3.1-24B-Instruct-2503 compares most closely with gemma-3-27B-it available here: https://huggingface.co/steampunque/gemma-3-27b-it-Hybrid-GGUF .
+A short summary of some key evals comparing the two models is given here for convenience:
+model  |  gemma-3-27b-it |  Mistral-Small-3.1-24B-Instruct-2503 |
+------|-----------------|------------|
+quant |   Q4_K_H        |    Q4_K_H  |
+alignment | strict      | permissive |
+  TEST    |             |            |
+Winogrande | 0.748      |  0.784     |
+Lambada    | 0.742      |  0.798     |
+Hellaswag  | 0.802      |  0.899     |
+BoolQ      | 0.701      |  0.646     |
+Jeopardy   | 0.830      |  0.740     |
+GSM8K      | 0.964      |  0.940     |
+Apple      | 0.850      |  0.820     |
+Humaneval  | 0.890      |  0.853     |
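
The summary table can be tallied with a short script. This is a minimal sketch, assuming the scores are transcribed by hand from the table above; the `scores` dictionary and the `compare` helper are illustrative names and are not part of the repository or the benchmark suite.

```python
# Scores transcribed from the table above, as
# (gemma-3-27b-it, Mistral-Small-3.1-24B-Instruct-2503), both at Q4_K_H.
scores = {
    "Winogrande": (0.748, 0.784),
    "Lambada": (0.742, 0.798),
    "Hellaswag": (0.802, 0.899),
    "BoolQ": (0.701, 0.646),
    "Jeopardy": (0.830, 0.740),
    "GSM8K": (0.964, 0.940),
    "Apple": (0.850, 0.820),
    "Humaneval": (0.890, 0.853),
}


def compare(scores):
    """Return per-test deltas (Mistral minus Gemma) and the tests each model leads."""
    deltas = {test: round(m - g, 3) for test, (g, m) in scores.items()}
    mistral_leads = [test for test, d in deltas.items() if d > 0]
    gemma_leads = [test for test, d in deltas.items() if d < 0]
    return deltas, mistral_leads, gemma_leads


deltas, mistral_leads, gemma_leads = compare(scores)
for test, d in deltas.items():
    print(f"{test:<11}{d:+.3f}")
print("Mistral leads:", ", ".join(mistral_leads))
print("Gemma leads:  ", ", ".join(gemma_leads))
```

On these numbers Mistral-Small-3.1-24B-Instruct-2503 leads on Winogrande, Lambada, and Hellaswag, while gemma-3-27b-it leads on the other five tests. The script only restates the table; it does not rerun any benchmark.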
 ## Download the file from below:
 | Link | Type | Size/e9 B | Notes |
 |------|------|-----------|-------|