Update README.md
README.md
CHANGED
@@ -23,6 +23,7 @@ GRPO is applied after a distilled R1 model is created to further refine its reas
 [https://huggingface.co/Dongwei/DeepSeek-R1-Distill-Qwen-7B-GRPO_Math](https://huggingface.co/Dongwei/DeepSeek-R1-Distill-Qwen-7B-GRPO_Math)
 
 - Converted to MLX format with a quantization of 8-bit for better performance on Apple Silicon Macs.
+- If you want a smaller (quantized) model, see the models below.
 
 # Notes:
 - Seems to brush over the "thinking" process and immediately start answering, leading to extremely quick but correct answers.