Update README.md
README.md
CHANGED
@@ -10,15 +10,11 @@ tags:
- chat
---

-<
-
-# Llama.cpp imatrix quantizations of Qwen/Qwen2.5-72B-Instruct

-

-Using llama.cpp commit [
-
-Original model: [Qwen/Qwen2.5-72B-Instruct](https://huggingface.co/Qwen/Qwen2.5-72B-Instruct)

All quants were made using the imatrix option and Bartowski's [calibration file](https://gist.github.com/bartowski1182/eb213dccb3571f863da82e99418f81e8).

@@ -26,6 +22,37 @@ All quants were made using the imatrix option and Bartowski's [calibration file]

# Perplexity table (the lower the better)

<hr>

# Qwen2.5-72B-Instruct
@@ -10,15 +10,11 @@ tags:
- chat
---

+<br><img src="https://cdn-uploads.huggingface.co/production/uploads/646410e04bf9122922289dc7/gDUbZOu1ND0j-th4Q6tep.jpeg" width="720"><br>

+# Llama.cpp imatrix quantizations of [Qwen/Qwen2.5-72B-Instruct](https://huggingface.co/Qwen/Qwen2.5-72B-Instruct)

+Using llama.cpp commit [3ad5451](https://github.com/ggerganov/llama.cpp/commit/3ad5451) for quantization.

All quants were made using the imatrix option and Bartowski's [calibration file](https://gist.github.com/bartowski1182/eb213dccb3571f863da82e99418f81e8).

@@ -26,6 +22,37 @@ All quants were made using the imatrix option and Bartowski's [calibration file]

# Perplexity table (the lower the better)

+| Quant                                                                                                                  | Size (MB) | PPL       | Size (%) | Accuracy (%) | PPL error rate |
+| ---------------------------------------------------------------------------------------------------------------------- | --------- | --------- | -------- | ------------ | -------------- |
+| [IQ1_S](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-IQ1_S.gguf)     | 21639     | 7.6552    | 15.60    | 67.82        | 0.11           |
+| [IQ1_M](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-IQ1_M.gguf)     | 22640     | 7.2982    | 16.32    | 71.14        | 0.10           |
+| [IQ2_XXS](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-IQ2_XXS.gguf) | 24309     | 6.3958    | 17.53    | 81.18        | 0.09           |
+| [IQ2_XS](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-IQ2_XS.gguf)   | 25804     | 6.0909    | 18.61    | 85.25        | 0.08           |
+| [IQ2_S](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-IQ2_S.gguf)     | 26644     | 6.0318    | 19.21    | 86.08        | 0.08           |
+| [IQ2_M](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-IQ2_M.gguf)     | 27979     | 5.7589    | 20.17    | 90.16        | 0.08           |
+| [Q2_K_S](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-Q2_K_S.gguf)   | 28199     | 5.9731    | 20.33    | 86.93        | 0.08           |
+| [Q2_K](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-Q2_K.gguf)       | 28430     | 5.9188    | 20.50    | 87.72        | 0.08           |
+| [IQ3_XXS](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-IQ3_XXS.gguf) | 30369     | 5.5227    | 21.90    | 94.01        | 0.07           |
+| [IQ3_XS](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-IQ3_XS.gguf)   | 31320     | 5.4357    | 22.58    | 95.52        | 0.07           |
+| [IQ3_S](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-IQ3_S.gguf)     | 32890     | 5.3782    | 23.72    | 96.54        | 0.07           |
+| [Q3_K_S](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-Q3_K_S.gguf)   | 32890     | 5.4492    | 23.72    | 95.28        | 0.07           |
+| [IQ3_M](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-IQ3_M.gguf)     | 33858     | 5.3550    | 24.41    | 96.96        | 0.07           |
+| [Q3_K_M](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-Q3_K_M.gguf)   | 35952     | 5.4069    | 25.92    | 96.03        | 0.07           |
+| [Q3_K_L](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-Q3_K_L.gguf)   | 37675     | 5.4116    | 27.17    | 95.94        | 0.07           |
+| [IQ4_XS](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-IQ4_XS.gguf)   | 37869     | 5.2776    | 27.31    | 98.38        | 0.07           |
+| [IQ4_NL](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-IQ4_NL.gguf)   | 39401     | 5.2747    | 28.41    | 98.43        | 0.07           |
+| [Q4_0](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-Q4_0.gguf)       | 39466     | 5.2998    | 28.46    | 97.97        | 0.07           |
+| [Q4_K_S](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-Q4_K_S.gguf)   | 41856     | 5.2535    | 30.18    | 98.83        | 0.07           |
+| [Q4_1](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-Q4_1.gguf)       | 43580     | 5.2801    | 31.42    | 98.33        | 0.07           |
+| [Q4_K_M](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-Q4_K_M.gguf)   | 45219     | 5.2478    | 32.60    | 98.94        | 0.07           |
+| [Q5_0](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-Q5_0.gguf)       | 47984     | 5.2160    | 34.60    | 99.54        | 0.07           |
+| [Q5_K_S](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-Q5_K_S.gguf)   | 48995     | 5.2242    | 35.33    | 99.39        | 0.07           |
+| [Q5_K_M](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-Q5_K_M.gguf)   | 51925     | 5.2182    | 37.44    | 99.50        | 0.07           |
+| [Q5_1](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/tree/main/Qwen2.5-72B-Instruct-Q5_1)            | 52099     | 5.2212    | 37.57    | 99.44        | 0.07           |
+| [Q6_K](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/tree/main/Qwen2.5-72B-Instruct-Q6_K)            | 61366     | 5.1952    | 44.25    | 99.94        | 0.07           |
+| [Q8_0](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/tree/main/Qwen2.5-72B-Instruct-Q8_0)            | 73683     | 5.1944    | 53.13    | 99.96        | 0.07           |
+| [F16](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/tree/main/Qwen2.5-72B-Instruct-F16)              | 138685    | 5.1922    | 100      | 100          | 0.07           |
+
<hr>

# Qwen2.5-72B-Instruct
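The README does not say how the `Size (%)` and `Accuracy (%)` columns of the perplexity table are computed. Judging from the numbers alone, they look like ratios against the F16 row: size as a share of the F16 size, and accuracy as F16 perplexity divided by the quant's perplexity. The sketch below works under that assumption. `relative_metrics` is a hypothetical helper, not part of llama.cpp or of the author's tooling, and because the table's inputs are already rounded, the results can differ from the published values in the last digit.

```python
# Assumed derivation of the table's relative columns (inferred from the
# published numbers, not confirmed by the README).

F16_SIZE_MB = 138685  # F16 row, "Size (MB)"
F16_PPL = 5.1922      # F16 row, "PPL"


def relative_metrics(size_mb, ppl, base_size_mb=F16_SIZE_MB, base_ppl=F16_PPL):
    """Return (size %, accuracy %) of a quant relative to the baseline row."""
    if base_size_mb <= 0 or ppl <= 0:
        raise ValueError("sizes and perplexities must be positive")
    size_pct = 100.0 * (size_mb / base_size_mb)
    # Lower perplexity is better, so the baseline goes in the numerator.
    accuracy_pct = 100.0 * (base_ppl / ppl)
    return size_pct, accuracy_pct


if __name__ == "__main__":
    # A few (Size (MB), PPL) pairs copied from the table.
    rows = {
        "IQ1_S": (21639, 7.6552),
        "Q4_K_M": (45219, 5.2478),
        "Q8_0": (73683, 5.1944),
    }
    for name, (size_mb, ppl) in rows.items():
        size_pct, accuracy_pct = relative_metrics(size_mb, ppl)
        print(f"{name}: size {size_pct:.2f}%, accuracy {accuracy_pct:.2f}%")
```

Read this way, `Accuracy (%)` is a perplexity ratio rather than a task accuracy: a value of 98.94 for Q4_K_M would mean its perplexity is roughly 1% above the F16 baseline.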