DeepSeek-R1-0528-Qwen3-8B-GGUF
Direct GGUF Quantizations of DeepSeek-R1-0528-Qwen3-8B
This repository provides GGUF quantized models for deepseek-ai/DeepSeek-R1-0528-Qwen3-8B.
DeepSeek-R1-0528-Qwen3-8B is an 8-billion-parameter large language model developed by DeepSeek AI. It is an instruct model based on the Qwen3 architecture that performs well across a wide range of text generation tasks, including chat, coding, and reasoning. These GGUF versions are optimized for efficient CPU and GPU inference with llama.cpp and compatible tools.
This release includes various quantization levels (e.g., Q2_K, Q3_K_M, Q4_K_M, Q5_K_M, Q6_K, Q8_0) to suit different hardware capabilities and performance requirements.
Table of Contents 📝
- ▶ Usage
- 📃 License
- 🙏 Acknowledgements
▶ Usage
1. Download Models
Download models using huggingface-cli:
pip install "huggingface_hub[cli]"
huggingface-cli download samuelchristlie/DeepSeek-R1-0528-Qwen3-8B-GGUF --local-dir ./DeepSeek-R1-0528-Qwen3-8B-GGUF
You can also download the files directly from this page.
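If you only need a single quantization level, the --include filter of huggingface-cli can restrict the download to matching files. The "*Q4_K_M*" pattern below assumes the quantization type appears in the filename, as is conventional for GGUF releases:
huggingface-cli download samuelchristlie/DeepSeek-R1-0528-Qwen3-8B-GGUF --include "*Q4_K_M*" --local-dir ./DeepSeek-R1-0528-Qwen3-8B-GGUF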
2. Inference
To use these GGUF files, you'll need a compatible inference engine such as llama.cpp, or a client built on top of it (e.g., Ollama, LM Studio, KoboldCpp, or text-generation-webui with the llama.cpp backend).
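As a minimal sketch of local inference with llama.cpp, the command below loads a quantized file and starts generating. The exact .gguf filename is an assumption based on this repository's naming, and the -ngl flag (offloading layers to the GPU) only has an effect in GPU-enabled builds:
./llama-cli -m ./DeepSeek-R1-0528-Qwen3-8B-GGUF/DeepSeek-R1-0528-Qwen3-8B-Q4_K_M.gguf -p "Why is the sky blue?" -n 512 -ngl 99
Reasoning models in the DeepSeek-R1 family tend to produce a chain of thought before the final answer, so a generous token limit (-n) is usually worthwhile.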
📃 License
This model is a GGUF conversion of the original deepseek-ai/DeepSeek-R1-0528-Qwen3-8B model. The original model is licensed under the MIT License, and this derivative work adheres to the terms of that license. Please review the original license for full details.
🙏 Acknowledgements
- DeepSeek AI for developing and open-sourcing the powerful DeepSeek-R1-0528-Qwen3-8B model.
- The llama.cpp project and its contributors for the GGUF format and the incredible tooling that makes local LLM inference accessible.
- city96
Available quantizations: 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit, 16-bit.