Snowflake/Arctic-LSTM-Speculator-Llama-3.1-8B-Instruct

jeffra committed on Apr 30

Commit 392c8ca · verified · 1 Parent(s): 4b94a29

Update README.md

Files changed (1)

README.md +1 -1

@@ -6,7 +6,7 @@ license: cc-by-nc-4.0
 Build a fastest OSS vllm-based speculative decoding system for your own model, using [ArcticTraining](https://github.com/snowflakedb/ArcticTraining) and [ArcticInference](https://github.com/snowflakedb/ArcticInference)!
-We compare the throughput (tokens/s) of existing vllm-based speculative decoding systmes for Llama3.1-70B-Instruct on 8xH100 as below:
+We compare the throughput (tokens/s) of existing vllm-based speculative decoding systems for Llama3.1-70B-Instruct on 8xH100 as below:
 | method                                 | ShareGPT      | HumanEval |
 |--------------------------------------|----------------|--------------|