---
license: mit
train: false
inference: true
pipeline_tag: text-generation
base_model:
- deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
---

<br><img src="https://cdn-uploads.huggingface.co/production/uploads/646410e04bf9122922289dc7/FHc3IG1KAJn6N3s1TJLrS.webp" width="720"><br>

# Llama.cpp imatrix quantizations of [mobiuslabsgmbh/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1](https://huggingface.co/mobiuslabsgmbh/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1)

Using llama.cpp commit [3ad5451](https://github.com/ggerganov/llama.cpp/commit/3ad5451) for quantization.

All quants were made using the imatrix option and Bartowski's [calibration file](https://gist.github.com/bartowski1182/eb213dccb3571f863da82e99418f81e8).
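
Quants like these can be reproduced with llama.cpp's own tools. A minimal sketch (binary names match recent llama.cpp builds; the calibration file name is a placeholder for the file linked above):

```shell
# 1) Build an importance matrix from the calibration text
./llama-imatrix -m DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-F16.gguf \
    -f calibration_data.txt -o imatrix.dat

# 2) Quantize the F16 GGUF using that importance matrix (IQ4_XS shown)
./llama-quantize --imatrix imatrix.dat \
    DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-F16.gguf \
    DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-IQ4_XS.gguf IQ4_XS
```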
<hr>

# Perplexity table (the lower the better)

| Quant | Size (MB) | PPL | Size (%) | Accuracy (%) | PPL error rate |
| ----- | --------- | --- | -------- | ------------ | -------------- |
| [IQ1_S](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-IQ1_S.gguf) | 489 | 88.4250 | 14.40 | 23.35 | 1.76 |
| [IQ1_M](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-IQ1_M.gguf) | 516 | 53.8278 | 15.19 | 38.35 | 1.03 |
| [IQ2_XXS](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-IQ2_XXS.gguf) | 560 | 45.5693 | 16.49 | 45.31 | 0.93 |
| [IQ2_XS](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-IQ2_XS.gguf) | 598 | 32.6813 | 17.61 | 63.17 | 0.62 |
| [IQ2_S](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-IQ2_S.gguf) | 633 | 28.5477 | 18.64 | 72.32 | 0.54 |
| [IQ2_M](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-IQ2_M.gguf) | 669 | 31.8272 | 19.70 | 64.87 | 0.63 |
| [Q2_K_S](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-Q2_K_S.gguf) | 683 | 28.7707 | 20.11 | 71.76 | 0.54 |
| [Q2_K](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-Q2_K.gguf) | 718 | 27.6342 | 21.14 | 74.71 | 0.51 |
| [IQ3_XXS](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-IQ3_XXS.gguf) | 733 | 23.5511 | 21.58 | 87.66 | 0.44 |
| [IQ3_XS](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-IQ3_XS.gguf) | 793 | 22.9887 | 23.35 | 89.81 | 0.42 |
| [Q3_K_S](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-Q3_K_S.gguf) | 821 | 28.0462 | 24.17 | 73.61 | 0.53 |
| [IQ3_S](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-IQ3_S.gguf) | 822 | 22.9268 | 24.20 | 90.05 | 0.42 |
| [IQ3_M](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-IQ3_M.gguf) | 836 | 22.3167 | 24.62 | 92.51 | 0.41 |
| [Q3_K_M](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-Q3_K_M.gguf) | 881 | 22.5727 | 25.94 | 91.46 | 0.41 |
| [Q3_K_L](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-Q3_K_L.gguf) | 935 | 22.3758 | 27.53 | 92.27 | 0.41 |
| [IQ4_XS](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-IQ4_XS.gguf) | 972 | 21.3273 | 28.62 | 96.80 | 0.38 |
| [IQ4_NL](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-IQ4_NL.gguf) | 1018 | 21.3234 | 29.98 | 96.82 | 0.38 |
| [Q4_0](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-Q4_0.gguf) | 1019 | 22.5210 | 30.00 | 91.67 | 0.41 |
| [Q4_K_S](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-Q4_K_S.gguf) | 1022 | 21.1717 | 30.09 | 97.51 | 0.38 |
| [Q4_K_M](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-Q4_K_M.gguf) | 1065 | 21.0532 | 31.36 | 98.06 | 0.38 |
| [Q4_1](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-Q4_1.gguf) | 1109 | 21.1492 | 32.66 | 97.62 | 0.38 |
| [Q5_K_S](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-Q5_K_S.gguf) | 1201 | 20.7883 | 35.37 | 99.31 | 0.37 |
| [Q5_0](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-Q5_0.gguf) | 1203 | 20.8643 | 35.42 | 98.95 | 0.37 |
| [Q5_K_M](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-Q5_K_M.gguf) | 1226 | 20.7488 | 36.10 | 99.50 | 0.37 |
| [Q5_1](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-Q5_1.gguf) | 1293 | 20.7773 | 38.07 | 99.37 | 0.37 |
| [Q6_K](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-Q6_K.gguf) | 1396 | 20.6994 | 41.11 | 99.74 | 0.37 |
| [Q8_0](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-Q8_0.gguf) | 1807 | 20.6659 | 53.21 | 99.90 | 0.37 |
| [F16](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-F16.gguf) | 3396 | 20.6457 | 100 | 100 | 0.37 |

<hr>
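
The Size (%) and Accuracy (%) columns in the table above are ratios against the F16 baseline. Every row is consistent with the computation sketched below (a reconstruction, not necessarily the exact script used for this table):

```python
# Relative size and "accuracy" (inverse PPL ratio) against the F16 baseline.
F16_SIZE_MB = 3396
F16_PPL = 20.6457

def derived_columns(size_mb: float, ppl: float) -> tuple[float, float]:
    """Return (Size %, Accuracy %) for one quant, relative to F16."""
    size_pct = round(100 * size_mb / F16_SIZE_MB, 2)
    accuracy_pct = round(100 * F16_PPL / ppl, 2)  # lower PPL -> higher accuracy
    return size_pct, accuracy_pct

# Q8_0 row from the table: 1807 MB at PPL 20.6659
print(derived_columns(1807, 20.6659))  # -> (53.21, 99.9)
```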
This is a version of the <a href="https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B">DeepSeek-R1-Distill-Qwen-1.5B</a> model, re-distilled for better performance.

## Performance

| Models | <a href="https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B">DeepSeek-R1-Distill-Qwen-1.5B</a> | <a href="https://huggingface.co/mobiuslabsgmbh/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1">DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1</a> |
|:-------------------:|:--------:|:----------------:|
| ARC (25-shot) | 40.96 | <b>41.55</b> |
| HellaSwag (10-shot) | 44.00 | <b>45.88</b> |
| MMLU (5-shot) | 39.27 | <b>41.82</b> |
| TruthfulQA-MC2 | 45.17 | <b>46.63</b> |
| Winogrande (5-shot) | 55.49 | <b>57.70</b> |
| GSM8K (5-shot) | 69.90 | <b>74.30</b> |
| Average | 49.13 | <b>51.31</b> |

| Models | <a href="https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B">DeepSeek-R1-Distill-Qwen-1.5B</a> | <a href="https://huggingface.co/mobiuslabsgmbh/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1">DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1</a> |
|:-------------------:|:--------:|:----------------:|
| GPQA (0-shot) | 26.96 | <b>26.99</b> |
| MMLU PRO (5-shot) | 16.74 | <b>19.86</b> |
| MUSR (0-shot) | 35.93 | <b>36.60</b> |
| BBH (3-shot) | 35.12 | <b>37.23</b> |
| IfEval (0-shot) | 24.94 | <b>27.22</b> |

## Usage

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

compute_dtype = torch.bfloat16
device = 'cuda'
model_id = "mobiuslabsgmbh/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1"

model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=compute_dtype, attn_implementation="sdpa", device_map=device)
tokenizer = AutoTokenizer.from_pretrained(model_id)

prompt = "What is 1.5+102.2?"
chat = tokenizer.apply_chat_template([{"role":"user", "content":prompt}], tokenize=True, add_generation_prompt=True, return_tensors="pt")
outputs = model.generate(chat.to(device), max_new_tokens=1024, do_sample=True)
print(tokenizer.decode(outputs[0]))
```

Output:
```
<|begin▁of▁sentence|><|User|>What is 1.5+102.2?<|Assistant|><think>
First, I identify the numbers involved in the addition: 1.5 and 102.2.

Next, I add the whole numbers: 1 + 102 equals 103.

Then, I add the decimal parts: 0.5 + 0.2 equals 0.7.

Finally, I combine the results: 103 + 0.7 equals 103.7.
</think>

To solve the addition \(1.5 + 102.2\), follow these steps:

1. **Add the whole numbers:**
   \[
   1 + 102 = 103
   \]

2. **Add the decimal parts:**
   \[
   0.5 + 0.2 = 0.7
   \]

3. **Combine the results:**
   \[
   103 + 0.7 = 103.7
   \]

So, the final answer is \(\boxed{103.7}\).<|end▁of▁sentence|>
```
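
The GGUF files in this repo can also be run directly with llama.cpp instead of transformers. A minimal sketch (binary name matches recent llama.cpp builds; the quant choice is up to you):

```shell
# Run a quant locally with llama.cpp (Q4_K_M shown)
./llama-cli -m DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-Q4_K_M.gguf \
    -p "What is 1.5+102.2?" -n 512
```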