ThomasBaruzier committed · verified
Commit 3a4cfce · 1 Parent(s): 9ce8e86

Update README.md

Files changed (1): README.md (+34 -7)
README.md CHANGED
@@ -10,15 +10,11 @@ tags:
  - chat
  ---

- <hr>
-
- # Llama.cpp imatrix quantizations of Qwen/Qwen2.5-72B-Instruct
+ <br><img src="https://cdn-uploads.huggingface.co/production/uploads/646410e04bf9122922289dc7/gDUbZOu1ND0j-th4Q6tep.jpeg" width="720"><br>

- <img src="https://cdn-uploads.huggingface.co/production/uploads/646410e04bf9122922289dc7/gDUbZOu1ND0j-th4Q6tep.jpeg" alt="qwen" width="60%"/>
+ # Llama.cpp imatrix quantizations of [Qwen/Qwen2.5-72B-Instruct](https://huggingface.co/Qwen/Qwen2.5-72B-Instruct)

- Using llama.cpp commit [eca0fab](https://github.com/ggerganov/llama.cpp/commit/eca0fab) for quantization.
-
- Original model: [Qwen/Qwen2.5-72B-Instruct](https://huggingface.co/Qwen/Qwen2.5-72B-Instruct)
+ Using llama.cpp commit [3ad5451](https://github.com/ggerganov/llama.cpp/commit/3ad5451) for quantization.

  All quants were made using the imatrix option and Bartowski's [calibration file](https://gist.github.com/bartowski1182/eb213dccb3571f863da82e99418f81e8).

@@ -26,6 +22,37 @@ All quants were made using the imatrix option and Bartowski's [calibration file]

  # Perplexity table (the lower the better)

+ | Quant | Size (MB) | PPL | Size (%) | Accuracy (%) | PPL error rate |
+ | ----- | --------- | --- | -------- | ------------ | -------------- |
+ | [IQ1_S](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-IQ1_S.gguf) | 21639 | 7.6552 | 15.60 | 67.82 | 0.11 |
+ | [IQ1_M](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-IQ1_M.gguf) | 22640 | 7.2982 | 16.32 | 71.14 | 0.10 |
+ | [IQ2_XXS](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-IQ2_XXS.gguf) | 24309 | 6.3958 | 17.53 | 81.18 | 0.09 |
+ | [IQ2_XS](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-IQ2_XS.gguf) | 25804 | 6.0909 | 18.61 | 85.25 | 0.08 |
+ | [IQ2_S](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-IQ2_S.gguf) | 26644 | 6.0318 | 19.21 | 86.08 | 0.08 |
+ | [IQ2_M](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-IQ2_M.gguf) | 27979 | 5.7589 | 20.17 | 90.16 | 0.08 |
+ | [Q2_K_S](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-Q2_K_S.gguf) | 28199 | 5.9731 | 20.33 | 86.93 | 0.08 |
+ | [Q2_K](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-Q2_K.gguf) | 28430 | 5.9188 | 20.50 | 87.72 | 0.08 |
+ | [IQ3_XXS](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-IQ3_XXS.gguf) | 30369 | 5.5227 | 21.90 | 94.01 | 0.07 |
+ | [IQ3_XS](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-IQ3_XS.gguf) | 31320 | 5.4357 | 22.58 | 95.52 | 0.07 |
+ | [IQ3_S](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-IQ3_S.gguf) | 32890 | 5.3782 | 23.72 | 96.54 | 0.07 |
+ | [Q3_K_S](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-Q3_K_S.gguf) | 32890 | 5.4492 | 23.72 | 95.28 | 0.07 |
+ | [IQ3_M](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-IQ3_M.gguf) | 33858 | 5.3550 | 24.41 | 96.96 | 0.07 |
+ | [Q3_K_M](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-Q3_K_M.gguf) | 35952 | 5.4069 | 25.92 | 96.03 | 0.07 |
+ | [Q3_K_L](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-Q3_K_L.gguf) | 37675 | 5.4116 | 27.17 | 95.94 | 0.07 |
+ | [IQ4_XS](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-IQ4_XS.gguf) | 37869 | 5.2776 | 27.31 | 98.38 | 0.07 |
+ | [IQ4_NL](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-IQ4_NL.gguf) | 39401 | 5.2747 | 28.41 | 98.43 | 0.07 |
+ | [Q4_0](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-Q4_0.gguf) | 39466 | 5.2998 | 28.46 | 97.97 | 0.07 |
+ | [Q4_K_S](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-Q4_K_S.gguf) | 41856 | 5.2535 | 30.18 | 98.83 | 0.07 |
+ | [Q4_1](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-Q4_1.gguf) | 43580 | 5.2801 | 31.42 | 98.33 | 0.07 |
+ | [Q4_K_M](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-Q4_K_M.gguf) | 45219 | 5.2478 | 32.60 | 98.94 | 0.07 |
+ | [Q5_0](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-Q5_0.gguf) | 47984 | 5.2160 | 34.60 | 99.54 | 0.07 |
+ | [Q5_K_S](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-Q5_K_S.gguf) | 48995 | 5.2242 | 35.33 | 99.39 | 0.07 |
+ | [Q5_K_M](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-Q5_K_M.gguf) | 51925 | 5.2182 | 37.44 | 99.50 | 0.07 |
+ | [Q5_1](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/tree/main/Qwen2.5-72B-Instruct-Q5_1) | 52099 | 5.2212 | 37.57 | 99.44 | 0.07 |
+ | [Q6_K](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/tree/main/Qwen2.5-72B-Instruct-Q6_K) | 61366 | 5.1952 | 44.25 | 99.94 | 0.07 |
+ | [Q8_0](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/tree/main/Qwen2.5-72B-Instruct-Q8_0) | 73683 | 5.1944 | 53.13 | 99.96 | 0.07 |
+ | [F16](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/tree/main/Qwen2.5-72B-Instruct-F16) | 138685 | 5.1922 | 100 | 100 | 0.07 |
+
  <hr>

  # Qwen2.5-72B-Instruct
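
The README above states only that the quants were produced with llama.cpp at commit 3ad5451, using the imatrix option and Bartowski's calibration file. As a rough, non-authoritative sketch of what that kind of workflow typically looks like, the snippet below drives llama.cpp's `llama-imatrix` and `llama-quantize` tools from Python; the file paths, the local calibration filename, and the IQ4_XS target are illustrative assumptions, not details taken from the card.

```python
# Rough sketch of a typical llama.cpp imatrix quantization workflow, wrapped in
# Python for illustration. All paths and filenames below are assumptions; the
# README only says the imatrix option and Bartowski's calibration file were used.
import subprocess

MODEL_F16 = "Qwen2.5-72B-Instruct-F16.gguf"  # assumed F16 GGUF conversion of the model
CALIB_TXT = "calibration.txt"                # assumed local copy of the calibration file
IMATRIX = "imatrix.dat"

# 1) Compute the importance matrix over the calibration text.
subprocess.run(
    ["./llama-imatrix", "-m", MODEL_F16, "-f", CALIB_TXT, "-o", IMATRIX],
    check=True,
)

# 2) Quantize using that importance matrix (IQ4_XS chosen as an example target).
subprocess.run(
    ["./llama-quantize", "--imatrix", IMATRIX, MODEL_F16,
     "Qwen2.5-72B-Instruct-IQ4_XS.gguf", "IQ4_XS"],
    check=True,
)
```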
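
The derived columns in the perplexity table appear to follow directly from the raw values: Size (%) looks like the quant's size relative to the F16 file, and Accuracy (%) looks like the F16 perplexity divided by the quant's perplexity. A minimal check against the IQ4_XS and F16 rows (the column definitions are inferred from the numbers, not stated in the README):

```python
# Minimal check of how the derived table columns appear to be computed
# (inferred from the published numbers, not stated in the README).
f16_size_mb, f16_ppl = 138685, 5.1922        # F16 row
iq4_xs_size_mb, iq4_xs_ppl = 37869, 5.2776   # IQ4_XS row

size_pct = 100 * iq4_xs_size_mb / f16_size_mb  # ~27.31, matches "Size (%)"
accuracy_pct = 100 * f16_ppl / iq4_xs_ppl      # ~98.38, matches "Accuracy (%)"

print(f"Size (%): {size_pct:.2f}   Accuracy (%): {accuracy_pct:.2f}")
```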
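
As a usage sketch (not part of the original card), one way to fetch a single quant from the table and run it is with `huggingface_hub` plus `llama-cpp-python`; the IQ4_XS filename comes from the table, while the runtime choice and the `n_ctx`/`n_gpu_layers` values are assumptions.

```python
# Sketch: download one quant from this repo and load it locally.
# The llama-cpp-python runtime and the n_ctx/n_gpu_layers values are
# illustrative choices, not part of the original model card.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

gguf_path = hf_hub_download(
    repo_id="ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF",
    filename="Qwen2.5-72B-Instruct-IQ4_XS.gguf",
)

llm = Llama(model_path=gguf_path, n_ctx=4096, n_gpu_layers=-1)
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize what a GGUF imatrix quant is."}]
)
print(out["choices"][0]["message"]["content"])
```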