Commit 69fb0e9 · jerryzh168 committed · verified · 1 Parent(s): 800c265

Update README.md
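
The speedup percentages in the added results table can be re-derived from the raw numbers. The sketch below is illustrative only: the function names are hypothetical, and it assumes "speedup" means the baseline-to-quantized ratio minus one (time ratio for latency, throughput ratio for serving). The commit does not state this definition.

```python
# Assumption: speedup = ratio - 1. The commit does not define the term.

def latency_speedup_pct(baseline_s: float, quantized_s: float) -> float:
    """Percent speedup for a latency metric (lower is better)."""
    return (baseline_s / quantized_s - 1.0) * 100.0


def throughput_speedup_pct(baseline_rps: float, quantized_rps: float) -> float:
    """Percent speedup for a throughput metric (higher is better)."""
    return (quantized_rps / baseline_rps - 1.0) * 100.0


# Raw values copied from the results table (baseline, float8dq).
latency_rows = {
    "latency (batch_size=1)": (1.64, 1.41),
    "latency (batch_size=128)": (3.1, 2.72),
}
serving_rows = {
    "serving (num_prompts=1)": (1.35, 1.57),
    "serving (num_prompts=1000)": (66.68, 80.53),
}

for name, (base, quant) in latency_rows.items():
    print(f"{name}: {round(latency_speedup_pct(base, quant))}% speedup")
for name, (base, quant) in serving_rows.items():
    print(f"{name}: {round(throughput_speedup_pct(base, quant))}% speedup")
```

Under that assumption the rounded results match the 16%, 14%, 16%, and 21% figures reported in the table.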

Files changed (1)
  1. README.md +9 -0
README.md CHANGED
@@ -98,6 +98,15 @@ lm_eval --model hf --model_args pretrained=jerryzh168/phi4-mini-float8dq --tasks

  # Model Performance

+ ## Results (H100 machine)
+ | Benchmark                  |                |                           |
+ |----------------------------|----------------|---------------------------|
+ |                            | Phi-4 mini-Ins | phi4-mini-float8dq        |
+ | latency (batch_size=1)     | 1.64 s         | 1.41 s (16% speedup)      |
+ | latency (batch_size=128)   | 3.1 s          | 2.72 s (14% speedup)      |
+ | serving (num_prompts=1)    | 1.35 req/s     | 1.57 req/s (16% speedup)  |
+ | serving (num_prompts=1000) | 66.68 req/s    | 80.53 req/s (21% speedup) |
+
  ## Download vllm source code and install vllm
  ```
  git clone git@github.com:vllm-project/vllm.git