linzhao-amd committed
Commit 4eec6d2 · verified · 1 Parent(s): 518dcc8

Update README.md

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -36,7 +36,6 @@ cd Quark/examples/torch/language_modeling/llm_ptq/
  exclude_layers="*self_attn* *mlp.gate.* *lm_head"
  python3 quantize_quark.py --model_dir $MODEL_DIR \
  --quant_scheme w_mxfp4_a_mxfp4 \
- --group_size 32 \
  --num_calib_data 128 \
  --exclude_layers $exclude_layers \
  --multi_gpu \
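
The `--group_size 32` flag is removed here; a plausible reason (not stated in the commit) is that the MXFP4 format already fixes its scaling block size at 32 elements, making the flag redundant for `w_mxfp4_a_mxfp4`. For reference, a sketch of how the command reads after this change; the trailing backslash marks flags that continue beyond this hunk:

```bash
# Sketch of the quantization command after this commit; flags following
# --multi_gpu are outside this hunk and therefore omitted here.
exclude_layers="*self_attn* *mlp.gate.* *lm_head"
python3 quantize_quark.py --model_dir $MODEL_DIR \
                          --quant_scheme w_mxfp4_a_mxfp4 \
                          --num_calib_data 128 \
                          --exclude_layers $exclude_layers \
                          --multi_gpu \
```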
@@ -52,7 +51,8 @@ This model can be deployed efficiently using the [SGLang](https://docs.sglang.ai

  ## Evaluation

- The model was evaluated on AIME24, GPQA Diamond, and MATH-500 benchmarks using the [lighteval](https://github.com/huggingface/lighteval/tree/v0.10.0) framework. Each benchmark was run 10 times with different random seeds for reliable performance estimation.
+ The model was evaluated on the AIME24, GPQA Diamond, MATH-500, and GSM8K benchmarks. The AIME24, GPQA Diamond, and MATH-500 tasks were conducted using the [lighteval](https://github.com/huggingface/lighteval/tree/v0.10.0) framework, with each task run 10 times using different random seeds for reliable performance estimation.
+ The GSM8K task was conducted using [lm-eval-harness](https://github.com/EleutherAI/lm-evaluation-harness).

  ### Accuracy

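The 10-seed protocol described in the added text could be driven by a simple loop around the lighteval command that appears further down in the README; the sketch below is hypothetical, reuses the README's `$MODEL_ARGS` and `$LOG` variables, and abbreviates the task string:

```bash
# Hypothetical 10-seed loop around the README's lighteval command; the full
# task list and the MODEL_ARGS definition are outside this diff, and MODEL_ARGS
# is assumed to embed the current $SEED.
for SEED in $(seq 1 10); do
    lighteval vllm $MODEL_ARGS "custom|aime24_single|0|0,custom|math_500_single|0|0,..." \
        2>&1 | tee -a "$LOG"
done
```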
@@ -98,7 +98,7 @@ The model was evaluated on AIME24, GPQA Diamond, and MATH-500 benchmarks using t
  </td>
  </tr>
  <tr>
- <td>gsm8k
+ <td>GSM8K
  </td>
  <td>95.30
  </td>
@@ -125,7 +125,7 @@ lighteval vllm $MODEL_ARGS "custom|aime24_single|0|0,custom|math_500_single|0|0,
  2>&1 | tee -a "$LOG"
  ```

- The result of gsm8k was obtained using lm-eval-harness.
+ The GSM8K result was obtained using [lm-eval-harness](https://github.com/EleutherAI/lm-evaluation-harness).

  ```
  MODEL_ARGS="model=amd/DeepSeek-R1-0528-MXFP4-ASQ,base_url=http://localhost:8000/v1/completions,num_concurrent=999999,timeout=999999,tokenized_requests=False,max_length=38768,temperature=0.6,top_p=0.95,add_bos_token=True,seed=$SEED"
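
The `MODEL_ARGS` string above matches lm-eval-harness's OpenAI-compatible `local-completions` backend, so the GSM8K run presumably looks something like the sketch below; the README's actual command is outside this hunk, and the flags shown are assumptions:

```bash
# Hypothetical lm-eval-harness invocation against the locally served model;
# treat the exact flags as illustrative, not as the README's command.
lm_eval --model local-completions \
        --model_args "$MODEL_ARGS" \
        --tasks gsm8k \
        2>&1 | tee -a "$LOG"
```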
 