Update README.md
README.md
CHANGED
@@ -6,7 +6,7 @@ license: cc-by-nc-4.0
 
 Build a fastest OSS vllm-based speculative decoding system for your own model, using [ArcticTraining](https://github.com/snowflakedb/ArcticTraining) and [ArcticInference](https://github.com/snowflakedb/ArcticInference)!
 
-We compare the throughput (tokens/s) of existing vllm-based speculative decoding
+We compare the throughput (tokens/s) of existing vllm-based speculative decoding systems for Llama3.1-70B-Instruct on 8xH100 as below:
 
 | method | ShareGPT | HumanEval |
 |--------------------------------------|----------------|--------------|