DeepSeek-R1-0528-Qwen3-8B-GGUF
Direct GGUF Quantizations of DeepSeek-R1-0528-Qwen3-8B
This repository provides GGUF quantized models for deepseek-ai/DeepSeek-R1-0528-Qwen3-8B.
DeepSeek-R1-0528-Qwen3-8B is an 8-billion-parameter large language model developed by DeepSeek AI. It is an instruct model based on the Qwen3 architecture that performs well across a wide range of text generation tasks, including chat, coding, and reasoning. These GGUF versions are optimized for efficient CPU and GPU inference with llama.cpp and compatible tools.
This release includes various quantization levels (e.g., Q2_K, Q3_K_M, Q4_K_M, Q5_K_M, Q6_K, Q8_0) to suit different hardware capabilities and performance requirements.
Table of Contents 📝
- ▶ Usage
- 📃 License
- 🙏 Acknowledgements
▶ Usage
1. Download Models
Download models using huggingface-cli:
pip install "huggingface_hub[cli]"
huggingface-cli download samuelchristlie/DeepSeek-R1-0528-Qwen3-8B-GGUF --local-dir ./DeepSeek-R1-0528-Qwen3-8B-GGUF
You can also download the files directly from this page.
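If you only need a single quantization level, the --include filter of huggingface-cli can restrict the download to matching files. The "*Q4_K_M*" pattern below assumes the quantization type appears in the filename, as is conventional for GGUF releases:
huggingface-cli download samuelchristlie/DeepSeek-R1-0528-Qwen3-8B-GGUF --include "*Q4_K_M*" --local-dir ./DeepSeek-R1-0528-Qwen3-8B-GGUF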
2. Inference
To use these GGUF files, you'll need a compatible inference engine such as llama.cpp, or a client built on top of it (e.g., Ollama, LM Studio, KoboldCpp, or text-generation-webui with the llama.cpp backend).
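As a minimal sketch of local inference with llama.cpp, the command below loads a quantized file and starts generating. The exact .gguf filename is an assumption based on this repository's naming, and the -ngl flag (offloading layers to the GPU) only has an effect in GPU-enabled builds:
./llama-cli -m ./DeepSeek-R1-0528-Qwen3-8B-GGUF/DeepSeek-R1-0528-Qwen3-8B-Q4_K_M.gguf -p "Why is the sky blue?" -n 512 -ngl 99
Reasoning models in the DeepSeek-R1 family tend to produce a chain of thought before the final answer, so a generous token limit (-n) is usually worthwhile.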
📃 License
This model is a GGUF conversion of the original deepseek-ai/DeepSeek-R1-0528-Qwen3-8B model. The original model is licensed under the MIT License, and this derivative work adheres to the terms of that license. Please review the original license for full details.
🙏 Acknowledgements
- DeepSeek AI for developing and open-sourcing the powerful DeepSeek-R1-0528-Qwen3-8B model.
- The llama.cpp project and its contributors for the GGUF format and the incredible tooling that makes local LLM inference accessible.
- city96
Available quantizations: 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit, 16-bit.