LLM360/K2-Think

@@ -71,7 +71,7 @@ We deploy K2-THINK on Cerebras Wafer-Scale Engine (WSE) systems, leveraging the
 | Platform                          | Throughput (tokens/sec) | Example: 32k-token response (time) |
 | --------------------------------- | ----------------------: | ---------------------------------: |
 | **Cerebras WSE (our deployment)** |             **\~2,000** |                         **\~16 s** |
-| Typical **H100/H200** GPU setup   |                   \~200 |                            \~160 s |
+| Typical Cloud Service setup   |                   \~200 |                            \~160 s |
 ---
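
The time column in this table appears to follow directly from dividing the response length by the throughput. A minimal sketch of that arithmetic, assuming "32k" means 32,000 tokens and that throughput stays constant for the whole response; the function name `response_time_seconds` is hypothetical and not part of the repository:

```python
def response_time_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Estimate generation time, assuming a constant decoding throughput."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


# Assumption: "32k" is read as 32,000 tokens, not 32,768.
RESPONSE_TOKENS = 32_000

# Approximate throughputs as listed in the table (tokens/sec).
throughputs = {
    "Cerebras WSE (our deployment)": 2_000,
    "Typical Cloud Service setup": 200,
}

for platform, tokens_per_second in throughputs.items():
    seconds = response_time_seconds(RESPONSE_TOKENS, tokens_per_second)
    print(f"{platform}: ~{seconds:.0f} s")
```

Under these assumptions the two rows give roughly 16 s and 160 s, which matches the table, so relabeling the second row does not change its figures.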