DeepSeek-R1-0528-Qwen3-8B-GGUF

Direct GGUF Quantizations of DeepSeek-R1-0528-Qwen3-8B

This repository provides GGUF quantized models for deepseek-ai/DeepSeek-R1-0528-Qwen3-8B.

DeepSeek-R1-0528-Qwen3-8B is a powerful 8-billion-parameter large language model developed by DeepSeek AI. It is an instruction-tuned model based on the Qwen3 architecture that excels at a wide range of text generation tasks, including chat, coding, and reasoning. These GGUF versions are optimized for efficient CPU and GPU inference with llama.cpp and compatible tools.

This release includes various quantization levels (e.g., Q2_K, Q3_K_M, Q4_K_M, Q5_K_M, Q6_K, Q8_0) to suit different hardware capabilities and performance requirements.

Table of Contents 📝

  1. Usage
  2. 📃 License
  3. 🙏 Acknowledgements

▶ Usage

1. Download Models

Download models using huggingface-cli:

pip install "huggingface_hub[cli]"
huggingface-cli download samuelchristlie/DeepSeek-R1-0528-Qwen3-8B-GGUF --local-dir ./DeepSeek-R1-0528-Qwen3-8B-GGUF
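
If you only need a single quantization level, huggingface-cli also accepts glob filters, so you can fetch one file instead of the whole repository. A minimal sketch, assuming the filenames follow the usual <model>-<quant>.gguf pattern (check the file list on this page for the exact names):

huggingface-cli download samuelchristlie/DeepSeek-R1-0528-Qwen3-8B-GGUF --include "*Q4_K_M*" --local-dir ./DeepSeek-R1-0528-Qwen3-8B-GGUF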

You can also download the files directly from this page.

2. Inference

To use these GGUF files, you'll need a compatible inference engine such as llama.cpp, or a client built on top of it (e.g., Ollama, LM Studio, KoboldCpp, or text-generation-webui with the llama.cpp backend).
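
For example, a minimal llama.cpp invocation might look like the following (the GGUF filename here is an assumption based on the usual naming pattern; -ngl offloads layers to the GPU and can be lowered or omitted on CPU-only machines):

llama-cli -m ./DeepSeek-R1-0528-Qwen3-8B-GGUF/DeepSeek-R1-0528-Qwen3-8B-Q4_K_M.gguf -p "Explain GGUF quantization in one paragraph." -n 256 -ngl 32

To serve the model over an OpenAI-compatible HTTP API instead, llama.cpp's bundled server works with the same file:

llama-server -m ./DeepSeek-R1-0528-Qwen3-8B-GGUF/DeepSeek-R1-0528-Qwen3-8B-Q4_K_M.gguf -c 4096 --port 8080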

📃 License

This model is a GGUF conversion of the original deepseek-ai/DeepSeek-R1-0528-Qwen3-8B model. The original model is licensed under the MIT License, and this derivative work adheres to the terms of that license. Please review the original license for full details.

🙏 Acknowledgements

Thanks to DeepSeek AI for the original DeepSeek-R1-0528-Qwen3-8B model, and to the llama.cpp project for the GGUF format and inference tooling.

