steampunque committed on
Commit a028397 · verified · 1 Parent(s): 5ce9c73

Update README.md

Files changed (1): README.md +21 -0
README.md CHANGED
@@ -77,6 +77,27 @@ readme in the tools directory of the source tree https://github.com/ggml-org/lla
  Use of the best available model (Q4_K_H) is recommended to maximize the accuracy of vision mode. To run it on a 12G VRAM
  GPU use --ngl 32. Generation speed is still quite good with partial offload.
 
+ Benchmarks:
+
+ A full set of benchmarks for the model will eventually be given here: https://huggingface.co/spaces/steampunque/benchlm
+
+ Mistral-Small-3.1-24B-Instruct-2503 compares most closely with gemma-3-27B-it, available here: https://huggingface.co/steampunque/gemma-3-27b-it-Hybrid-GGUF .
+ A short summary of some key evals comparing the two models is given here for convenience:
+
+ model | gemma-3-27b-it | Mistral-Small-3.1-24B-Instruct-2503 |
+ ------|----------------|-------------------------------------|
+ quant | Q4_K_H | Q4_K_H |
+ alignment | strict | permissive |
+ TEST | | |
+ Winogrande | 0.748 | 0.784 |
+ Lambada | 0.742 | 0.798 |
+ Hellaswag | 0.802 | 0.899 |
+ BoolQ | 0.701 | 0.646 |
+ Jeopardy | 0.830 | 0.740 |
+ GSM8K | 0.964 | 0.940 |
+ Apple | 0.850 | 0.820 |
+ Humaneval | 0.890 | 0.853 |
+
  ## Download the file from below:
  | Link | Type | Size/e9 B | Notes |
  |------|------|-----------|-------|
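
The `--ngl 32` advice in the README context above can be sketched as a llama.cpp invocation. This is a minimal sketch, not from the commit itself: the model filename is an assumption (use whichever quant file you downloaded), and the prompt is a placeholder.

```shell
# Sketch: run the model with llama.cpp's llama-cli, offloading 32 layers
# to the GPU per the README's 12G VRAM recommendation. The .gguf filename
# below is an assumption; substitute the file you actually downloaded.
./llama-cli \
  -m Mistral-Small-3.1-24B-Instruct-2503.Q4_K_H.gguf \
  --ngl 32 \
  -p "Hello"
```

With partial offload (`--ngl 32` rather than all layers), the remaining layers run on the CPU, which is why the README notes generation speed is "still quite good" rather than full GPU speed.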