ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF


ThomasBaruzier committed on Feb 22

Commit 3a4cfce (verified) · 1 parent: 9ce8e86

Update README.md

Files changed (1): README.md (+34 -7)

@@ -10,15 +10,11 @@ tags:
 - chat
 ---
-<hr>
-# Llama.cpp imatrix quantizations of Qwen/Qwen2.5-72B-Instruct
-<img src="https://cdn-uploads.huggingface.co/production/uploads/646410e04bf9122922289dc7/gDUbZOu1ND0j-th4Q6tep.jpeg" alt="qwen" width="60%"/>
-Using llama.cpp commit [eca0fab](https://github.com/ggerganov/llama.cpp/commit/eca0fab) for quantization.
-Original model: [Qwen/Qwen2.5-72B-Instruct](https://huggingface.co/Qwen/Qwen2.5-72B-Instruct)
 All quants were made using the imatrix option and Bartowski's [calibration file](https://gist.github.com/bartowski1182/eb213dccb3571f863da82e99418f81e8).
@@ -26,6 +22,37 @@ All quants were made using the imatrix option and Bartowski's [calibration file]
 # Perplexity table (the lower the better)
 <hr>
 # Qwen2.5-72B-Instruct

 - chat
 ---
+<br><img src="https://cdn-uploads.huggingface.co/production/uploads/646410e04bf9122922289dc7/gDUbZOu1ND0j-th4Q6tep.jpeg" width="720"><br>
+# Llama.cpp imatrix quantizations of [Qwen/Qwen2.5-72B-Instruct](https://huggingface.co/Qwen/Qwen2.5-72B-Instruct)
+Using llama.cpp commit [3ad5451](https://github.com/ggerganov/llama.cpp/commit/3ad5451) for quantization.
 All quants were made using the imatrix option and Bartowski's [calibration file](https://gist.github.com/bartowski1182/eb213dccb3571f863da82e99418f81e8).
 # Perplexity table (the lower the better)
+| Quant                                                                                                                  | Size (MB) | PPL       | Size (%) | Accuracy (%) | PPL error rate |
+| ---------------------------------------------------------------------------------------------------------------------- | --------- | --------- | -------- | ------------ | -------------- |
+| [IQ1_S](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-IQ1_S.gguf)     | 21639     | 7.6552    | 15.60    | 67.82        | 0.11           |
+| [IQ1_M](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-IQ1_M.gguf)     | 22640     | 7.2982    | 16.32    | 71.14        | 0.10           |
+| [IQ2_XXS](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-IQ2_XXS.gguf) | 24309     | 6.3958    | 17.53    | 81.18        | 0.09           |
+| [IQ2_XS](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-IQ2_XS.gguf)   | 25804     | 6.0909    | 18.61    | 85.25        | 0.08           |
+| [IQ2_S](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-IQ2_S.gguf)     | 26644     | 6.0318    | 19.21    | 86.08        | 0.08           |
+| [IQ2_M](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-IQ2_M.gguf)     | 27979     | 5.7589    | 20.17    | 90.16        | 0.08           |
+| [Q2_K_S](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-Q2_K_S.gguf)   | 28199     | 5.9731    | 20.33    | 86.93        | 0.08           |
+| [Q2_K](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-Q2_K.gguf)       | 28430     | 5.9188    | 20.50    | 87.72        | 0.08           |
+| [IQ3_XXS](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-IQ3_XXS.gguf) | 30369     | 5.5227    | 21.90    | 94.01        | 0.07           |
+| [IQ3_XS](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-IQ3_XS.gguf)   | 31320     | 5.4357    | 22.58    | 95.52        | 0.07           |
+| [IQ3_S](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-IQ3_S.gguf)     | 32890     | 5.3782    | 23.72    | 96.54        | 0.07           |
+| [Q3_K_S](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-Q3_K_S.gguf)   | 32890     | 5.4492    | 23.72    | 95.28        | 0.07           |
+| [IQ3_M](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-IQ3_M.gguf)     | 33858     | 5.3550    | 24.41    | 96.96        | 0.07           |
+| [Q3_K_M](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-Q3_K_M.gguf)   | 35952     | 5.4069    | 25.92    | 96.03        | 0.07           |
+| [Q3_K_L](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-Q3_K_L.gguf)   | 37675     | 5.4116    | 27.17    | 95.94        | 0.07           |
+| [IQ4_XS](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-IQ4_XS.gguf)   | 37869     | 5.2776    | 27.31    | 98.38        | 0.07           |
+| [IQ4_NL](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-IQ4_NL.gguf)   | 39401     | 5.2747    | 28.41    | 98.43        | 0.07           |
+| [Q4_0](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-Q4_0.gguf)       | 39466     | 5.2998    | 28.46    | 97.97        | 0.07           |
+| [Q4_K_S](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-Q4_K_S.gguf)   | 41856     | 5.2535    | 30.18    | 98.83        | 0.07           |
+| [Q4_1](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-Q4_1.gguf)       | 43580     | 5.2801    | 31.42    | 98.33        | 0.07           |
+| [Q4_K_M](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-Q4_K_M.gguf)   | 45219     | 5.2478    | 32.60    | 98.94        | 0.07           |
+| [Q5_0](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-Q5_0.gguf)       | 47984     | 5.2160    | 34.60    | 99.54        | 0.07           |
+| [Q5_K_S](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-Q5_K_S.gguf)   | 48995     | 5.2242    | 35.33    | 99.39        | 0.07           |
+| [Q5_K_M](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/blob/main/Qwen2.5-72B-Instruct-Q5_K_M.gguf)   | 51925     | 5.2182    | 37.44    | 99.50        | 0.07           |
+| [Q5_1](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/tree/main/Qwen2.5-72B-Instruct-Q5_1)            | 52099     | 5.2212    | 37.57    | 99.44        | 0.07           |
+| [Q6_K](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/tree/main/Qwen2.5-72B-Instruct-Q6_K)            | 61366     | 5.1952    | 44.25    | 99.94        | 0.07           |
+| [Q8_0](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/tree/main/Qwen2.5-72B-Instruct-Q8_0)            | 73683     | 5.1944    | 53.13    | 99.96        | 0.07           |
+| [F16](https://huggingface.co/ThomasBaruzier/Qwen2.5-72B-Instruct-GGUF/tree/main/Qwen2.5-72B-Instruct-F16)              | 138685    | 5.1922    | 100      | 100          | 0.07           |
 <hr>
 # Qwen2.5-72B-Instruct
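
The README does not state how the `Size (%)` and `Accuracy (%)` columns of the perplexity table are derived. The sketch below assumes both are relative to the F16 row: size as a share of the F16 file size, and accuracy as the F16 perplexity divided by the quant's perplexity. The function name `relative_columns` and the `ROWS` dictionary are illustrative only; the numbers are copied from a handful of rows in the table. Under these assumed formulas the results land within about 0.01 of the published figures, which suggests the author may have computed them from unrounded values.

```python
# Sizes (MB) and perplexities copied from a few rows of the table above.
ROWS = {
    "IQ1_S": (21639, 7.6552),
    "IQ2_M": (27979, 5.7589),
    "Q4_K_M": (45219, 5.2478),
    "Q8_0": (73683, 5.1944),
    "F16": (138685, 5.1922),
}


def relative_columns(rows, baseline="F16"):
    """Return {quant: (size_pct, accuracy_pct)} relative to the baseline row.

    Assumed formulas (not stated in the README):
      size_pct     = 100 * size / baseline_size
      accuracy_pct = 100 * baseline_ppl / ppl
    """
    base_size, base_ppl = rows[baseline]
    return {
        name: (round(100 * size / base_size, 2), round(100 * base_ppl / ppl, 2))
        for name, (size, ppl) in rows.items()
    }


if __name__ == "__main__":
    for name, (size_pct, acc_pct) in relative_columns(ROWS).items():
        print(f"{name:8s} size={size_pct:6.2f}%  accuracy={acc_pct:6.2f}%")
```

Reading the table this way makes the trade-off easier to compare across rows: under the assumed formulas, each quant's accuracy figure is simply how close its perplexity stays to the F16 baseline.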