# DeepSeek-LLM-7B-Chat — RKLLM build for RK3588 boards

Author: @jamescallander
Source model: deepseek-ai/deepseek-llm-7b-chat
Target: Rockchip RK3588 NPU via RKNN-LLM Runtime
This repository hosts a conversion of DeepSeek-LLM-7B-Chat for use on Rockchip RK3588-equipped single-board computers (Orange Pi 5 Plus, Radxa ROCK 5B+, Banana Pi M7, etc.). The conversion was performed with the RKNN-LLM toolkit.
## Conversion details
- RKLLM-Toolkit version: v1.2.1
- NPU driver: v0.9.8
- Python 3.11
- Quantization: w8a8_g128
- Output: single-file `.rkllm` artifact
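For reference, a conversion along these lines can be scripted with the RKLLM-Toolkit Python API. The sketch below is an outline rather than the exact script used here: the method names (`load_huggingface`, `build`, `export_rkllm`) follow the toolkit's examples, but argument names and defaults may differ between toolkit versions.

```python
from rkllm.api import RKLLM

llm = RKLLM()

# Load the source Hugging Face checkpoint (local path or repo id)
ret = llm.load_huggingface(model="deepseek-ai/deepseek-llm-7b-chat")
assert ret == 0, "model load failed"

# Quantize (w8a8, group size 128) and build for the RK3588 NPU
ret = llm.build(
    do_quantization=True,
    quantized_dtype="w8a8_g128",
    target_platform="rk3588",
)
assert ret == 0, "build failed"

# Export the single-file .rkllm artifact
ret = llm.export_rkllm("./rk3588-deepseek-llm-7b-chat.rkllm")
assert ret == 0, "export failed"
```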
## Intended use
This build is intended for experimentation and deployment of DeepSeek-LLM-7B-Chat on Rockchip RK3588-based SBCs.
## Limitations
- Requires 9 GB of free memory (a quick check is shown after this list).
- The model is quantized (w8a8_g128), so slight quality differences from the FP16 baseline may occur.
- Tested on Orange Pi 5 Plus, Orange Pi 5 Max, and Radxa ROCK 5B+; other platforms may not be supported.
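Before loading the model, a quick way to confirm the board has enough free memory (standard Linux tooling, nothing toolkit-specific):

```bash
# the "available" column should show roughly 9G or more
free -h
```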
## Quick start (RK3588)
### 1) Install the runtime
The RKNN-LLM toolkit and installation instructions can be found on the development board manufacturer's website or on airockchip's GitHub page. Download and install the required packages as described in the toolkit's instructions.
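For example, the toolkit repository (which also contains the server demo used in the next step) can be cloned directly; the URL below assumes airockchip's GitHub mirror:

```bash
git clone https://github.com/airockchip/rknn-llm.git
```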
### 2) Simple Flask server deployment
The simplest way to deploy the converted `.rkllm` model is with the example script provided in the toolkit under `rknn-llm/examples/rkllm_server_demo`:
```bash
python3 <TOOLKIT_PATH>/rknn-llm/examples/rkllm_server_demo/flask_server.py \
    --rkllm_model_path <MODEL_PATH>/rk3588-deepseek-llm-7b-chat.rkllm \
    --target_platform rk3588
```
### 3) Sending a request
A basic message request has the following format:
```json
{
  "model": "deepseek-7b",
  "messages": [
    {"role": "user", "content": "<YOUR_PROMPT_HERE>"}
  ],
  "stream": false
}
```
Example request using `curl`:
```bash
curl -s -X POST <MODEL_SERVER_IP_ADDRESS>:8080/rkllm_chat \
  -H 'Content-Type: application/json' \
  -d '{"model":"deepseek-7b","messages":[{"role":"user","content":"In one sentence, who was Napoleon?"}],"stream":false}'
```
The response is formatted as follows:
```json
{
  "choices": [{
    "finish_reason": "stop",
    "index": 0,
    "logprobs": null,
    "message": {
      "content": "<MODEL_REPLY_HERE>",
      "role": "assistant"
    }
  }],
  "created": null,
  "id": "rkllm_chat",
  "object": "rkllm_chat",
  "usage": {
    "completion_tokens": null,
    "prompt_tokens": null,
    "total_tokens": null
  }
}
```
Example response:
{"choices":[{"finish_reason":"stop","index":0,"logprobs":null,"message":{"content":"Napoleon Bonaparte (1769-1821) was a French military leader and statesman who rose to power during the French Revolution, becoming Emperor of France from 1804 to 1815 and implementing various reforms and conquests that had a lasting impact on European history.","role":"assistant"}}],"created":null,"id":"rkllm_chat","object":"rkllm_chat","usage":{"completion_tokens":null,"prompt_tokens":null,"total_tokens":null}}
### 4) UI compatibility
This server exposes an OpenAI-compatible Chat Completions API, so you can connect it to any OpenAI-compatible client or UI (for example, Open WebUI):

- Configure your client with the API base `http://<SERVER_IP_ADDRESS>:8080` and use the endpoint `/rkllm_chat`.
- Make sure the `model` field matches the converted model's name, for example:
```json
{
  "model": "DeepSeek-LLM-7B-Chat",
  "messages": [{"role": "user", "content": "Hello!"}],
  "stream": false
}
```
## License
This conversion follows the license of the source model: DeepSeek LLM license.
- Attribution: Built with DeepSeek (© 2023 DeepSeek).
- Modifications: quantization (w8a8_g128), export to `.rkllm` format for RK3588 SBCs.
- Use restrictions: The model and its derivatives may not be used for military purposes, harming minors, harassment, generating PII without authorization, fully automated binding decisions, or other prohibited uses listed in Attachment A of the DeepSeek License Agreement.
For more information on deploying and using `.rkllm` models on RK3588 platforms, refer to the RKNN-LLM toolkit documentation.