Update README.md
README.md CHANGED

@@ -47,7 +47,7 @@ You also have to have the model on a CUDA device.
 
 The recommended way to perform efficient inference with Jamba Mini 1.6 is using [vLLM](https://docs.vllm.ai/en/latest/). First, make sure to install vLLM (version 0.5.4 or higher is required)
 ```bash
-pip install vllm>=0.5.4
+pip install "vllm>=0.5.4"
 ```
 
 In the example below, `number_gpus` should match the number of GPUs you want to deploy Jamba Mini 1.6 on. A minimum of 2 80GB GPUs is required.
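The change quotes the version specifier: in a POSIX shell, the unquoted `>` in `pip install vllm>=0.5.4` is parsed as output redirection, so the command actually runs `pip install vllm` and writes its output to a file named `=0.5.4`. Quoting passes the full `vllm>=0.5.4` requirement through to pip.

For context, here is a minimal sketch of the kind of multi-GPU vLLM deployment the README's `number_gpus` note refers to; the model ID `ai21labs/AI21-Jamba-Mini-1.6`, the prompt, and the sampling parameters are illustrative assumptions, not taken from this diff:

```python
from vllm import LLM, SamplingParams

# number_gpus should match the number of GPUs you deploy on;
# the README requires a minimum of 2 80GB GPUs for Jamba Mini 1.6.
number_gpus = 2

# Model ID is an assumption (the diff hunk does not show it).
llm = LLM(
    model="ai21labs/AI21-Jamba-Mini-1.6",
    tensor_parallel_size=number_gpus,  # shard the weights across the GPUs
)

params = SamplingParams(temperature=0.4, top_p=0.95, max_tokens=100)
outputs = llm.generate(["What are the benefits of hybrid SSM-Transformer models?"], params)
print(outputs[0].outputs[0].text)
```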