steampunque committed on
Commit 3acc215 · verified · 1 Parent(s): 196d921

Update README.md

Files changed (1):
  1. README.md +21 -0
README.md CHANGED
@@ -70,6 +70,27 @@ There is a problem when using q8_0 KV cache format where some heavy computations
  gen become unusably slow. This does not happen with f16 KV, so it is recommended to stay with f16 KV until/if this problem gets resolved.
  Related discussion: https://github.com/ggml-org/llama.cpp/issues/13747.
 
+ Benchmarks:
+
+ A full set of benchmarks for the model will eventually be given here: https://huggingface.co/spaces/steampunque/benchlm
+
+ gemma-3-27b-it compares most closely with Mistral-Small-3.1-24B-Instruct-2503, available here: https://huggingface.co/steampunque/Mistral-Small-3.1-24B-Instruct-2503-Hybrid-GGUF.
+ A short summary of some key evals comparing the two models is given here for convenience:
+
+ | model | gemma-3-27b-it | Mistral-Small-3.1-24B-Instruct-2503 |
+ |-------|----------------|-------------------------------------|
+ | quant | Q4_K_H | Q4_K_H |
+ | alignment | strict | permissive |
+ | TEST | | |
+ | Winogrande | 0.748 | 0.784 |
+ | Lambada | 0.742 | 0.798 |
+ | Hellaswag | 0.802 | 0.899 |
+ | BoolQ | 0.701 | 0.646 |
+ | Jeopardy | 0.830 | 0.740 |
+ | GSM8K | 0.964 | 0.940 |
+ | Apple | 0.850 | 0.820 |
+ | Humaneval | 0.890 | 0.853 |
+
  ## Download the file from below:
  | Link | Type | Size/e9 B | Notes |
  |------|------|-----------|-------|