DeepSeek R1 0528 Qwen3 8B GGUF

Original model: DeepSeek-R1-0528-Qwen3-8B

Model creator: DeepSeek AI

We distilled the chain-of-thought from DeepSeek-R1-0528 to post-train Qwen3 8B Base, obtaining DeepSeek-R1-0528-Qwen3-8B. This model achieves state-of-the-art (SOTA) performance among open-source models on the AIME 2024, surpassing Qwen3 8B by +10.0% and matching the performance of Qwen3-235B-thinking. We believe that the chain-of-thought from DeepSeek-R1-0528 will hold significant importance for both academic research on reasoning models and industrial development focused on small-scale models.

This repo contains GGUF format model files for DeepSeek AI's DeepSeek R1 0528 Qwen3 8B.

What is GGUF?

GGUF is a file format for storing models for inference with llama.cpp and compatible runtimes. Introduced by the llama.cpp team on August 21st, 2023, it is the third iteration of the project's model format, succeeding GGML and GGJT.
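For orientation, every GGUF file starts with the four ASCII bytes `GGUF` followed by a little-endian uint32 format version. A minimal Python sketch that validates a header (the function name is illustrative, not part of any library):

```python
import struct

GGUF_MAGIC = b"GGUF"  # first four bytes of every GGUF file

def read_gguf_version(data: bytes) -> int:
    """Check the GGUF magic and return the format version from a file header."""
    if data[:4] != GGUF_MAGIC:
        raise ValueError("not a GGUF file")
    # the format version is a little-endian uint32 right after the magic
    (version,) = struct.unpack_from("<I", data, 4)
    return version

# Example: a hand-built 8-byte header for a version-3 GGUF file
header = GGUF_MAGIC + struct.pack("<I", 3)
print(read_gguf_version(header))  # 3
```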

Converted with llama.cpp build b5536 (revision 2b13162), using autogguf-rs.

Prompt template: DeepSeek R1

{{system_message}}

<|User|>{{prompt}}<|Assistant|>
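The template can be rendered in code; here is a minimal single-turn Python sketch. The direct concatenation (no separators beyond the special tokens themselves) is an assumption based on the template as shown above.

```python
def build_prompt(prompt: str, system_message: str = "") -> str:
    """Render a single-turn prompt in the DeepSeek R1 template.

    <|User|> and <|Assistant|> are the template's special tokens; the
    optional system message is prepended verbatim (assumed, per the
    template layout above).
    """
    return f"{system_message}<|User|>{prompt}<|Assistant|>"

# Usage
print(build_prompt("Why is the sky blue?", system_message="You are a concise assistant."))
```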

Notes from DeepSeek on Running Locally

Compared to previous versions of DeepSeek-R1, the usage recommendations for DeepSeek-R1-0528 have the following changes:

  • System prompt is supported now.
  • It is no longer necessary to add <think>\n at the beginning of the output to force the model into thinking mode.

The model architecture of DeepSeek-R1-0528-Qwen3-8B is identical to that of Qwen3-8B, but it shares the same tokenizer configuration as DeepSeek-R1-0528.
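These files run with stock llama.cpp. A sketch of an interactive session with the llama.cpp CLI follows; the quantization filename is illustrative (substitute whichever quant you downloaded), and the temperature is an assumption based on DeepSeek's general R1 sampling recommendations.

```shell
# Sketch: run one of these GGUF files with llama.cpp's CLI.
# The filename below is illustrative -- use the quant you actually downloaded.
./llama-cli \
  -m DeepSeek-R1-0528-Qwen3-8B.Q4_K_M.gguf \
  --temp 0.6 \
  -cnv   # conversation mode applies the chat template from the GGUF metadata
```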


Download & run with cnvrs on iPhone, iPad, and Mac!

cnvrs.ai

cnvrs is the best app for private, local AI on your device.


Original Model Evaluation

DeepSeek reports the following benchmark results for the original (unquantized) model:

| Model | AIME 24 | AIME 25 | HMMT Feb 25 | GPQA Diamond | LiveCodeBench (2408–2505) |
| --- | --- | --- | --- | --- | --- |
| Qwen3-235B-A22B | 85.7 | 81.5 | 62.5 | 71.1 | 66.5 |
| Qwen3-32B | 81.4 | 72.9 | - | 68.4 | - |
| Qwen3-8B | 76.0 | 67.3 | - | 62.0 | - |
| Phi-4-Reasoning-Plus-14B | 81.3 | 78.0 | 53.6 | 69.3 | - |
| Gemini-2.5-Flash-Thinking-0520 | 82.3 | 72.0 | 64.2 | 82.8 | 62.3 |
| o3-mini (medium) | 79.6 | 76.7 | 53.3 | 76.8 | 65.9 |
| DeepSeek-R1-0528-Qwen3-8B | 86.0 | 76.3 | 61.5 | 61.1 | 60.5 |

DeepSeek R1 0528 Qwen3 8B in cnvrs on iOS



GGUF

Model size: 8.19B params
Architecture: qwen3

Available quantizations: 1-bit, 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit, 16-bit


Model tree for brittlewis12/DeepSeek-R1-0528-Qwen3-8B-GGUF
