amd/DeepSeek-R1-0528-MXFP4-ASQ


linzhao-amd committed about 1 month ago

Commit df6afb4 · verified · 1 Parent(s): a94dc97

Update README.md

Files changed (1)

README.md +9 -1

@@ -129,9 +129,17 @@ for i in $(seq 1 10); do
         2>&1 | tee -a "$LOG"
 ```
-The result of GSM8K was obtained using [lm-eval-harness](https://github.com/EleutherAI/lm-evaluation-harness) and the following commands.
 ```
 MODEL_ARGS="model=amd/DeepSeek-R1-0528-MXFP4-ASQ,base_url=http://localhost:8000/v1/completions,num_concurrent=999999,timeout=999999,tokenized_requests=False,max_length=38768,temperature=0.6,top_p=0.95,add_bos_token=True,seed=$SEED"
 lm_eval \
     --model local-completions \

         2>&1 | tee -a "$LOG"
 ```
+The result of GSM8K was obtained using [lm-eval-harness](https://github.com/EleutherAI/lm-evaluation-harness) and [SGLang](https://docs.sglang.ai/), running with [docker](https://hub.docker.com/layers/lmsysorg/sglang/v0.5.3.post3-rocm700-mi35x-srt/images/sha256-8c7281fcd4adc7942c7e674d464fee322d1775d7b546596ab4cc7edd258517fc).
 ```
+# Launching server
+SGLANG_USE_AITER=1 python -m sglang.launch_server \
+    --model-path $MODEL_DIR \
+    --tp 8 \
+    --port 8000 \
+    --attention-backend aiter
+#
 MODEL_ARGS="model=amd/DeepSeek-R1-0528-MXFP4-ASQ,base_url=http://localhost:8000/v1/completions,num_concurrent=999999,timeout=999999,tokenized_requests=False,max_length=38768,temperature=0.6,top_p=0.95,add_bos_token=True,seed=$SEED"
 lm_eval \
     --model local-completions \