writinwaters committed · c2523f0 · Parent: cd84e5d

Fixed a docusaurus display issue (#1431)
### What problem does this PR solve?
_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._
### Type of change
- [x] Documentation Update
docs/guides/deploy_local_llm.md CHANGED
@@ -236,32 +236,28 @@ You may launch the Ollama service as below:
 ollama serve
 ```
 
-
+
 > Please set environment variable `OLLAMA_NUM_GPU` to `999` to make sure all layers of your model are running on Intel GPU, otherwise, some layers may run on CPU.
 
-
+
 > If your local LLM is running on Intel Arc™ A-Series Graphics with Linux OS (Kernel 6.2), it is recommended to additionaly set the following environment variable for optimal performance before executing `ollama serve`:
 >
 > ```bash
 > export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 > ```
 
-
+
 > To allow the service to accept connections from all IP addresses, use `OLLAMA_HOST=0.0.0.0 ./ollama serve` instead of just `./ollama serve`.
 
 The console will display messages similar to the following:
 
-
-<img src="https://llm-assets.readthedocs.io/en/latest/_images/ollama_serve.png" width=100%; />
-</a>
+
 
 ### 3. Pull and Run Ollama Model
 
 Keep the Ollama service on and open another terminal and run `./ollama pull <model_name>` in Linux (`ollama.exe pull <model_name>` in Windows) to automatically pull a model. e.g. `qwen2:latest`:
 
-
-<img src="https://llm-assets.readthedocs.io/en/latest/_images/ollama_pull.png" width=100%; />
-</a>
+
 
 #### Run Ollama Model
 
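For reference, the steps described in the updated section can be combined into a short launch sequence. This is a minimal sketch, not part of the diff, assuming the IPEX-LLM build of Ollama sits in the current directory and a second terminal is used for the pull step:

```bash
# Terminal 1: launch the Ollama service on an Intel GPU.
export OLLAMA_NUM_GPU=999                               # run all model layers on the Intel GPU
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1  # recommended on Arc A-Series with Linux kernel 6.2
OLLAMA_HOST=0.0.0.0 ./ollama serve                      # accept connections from all IP addresses

# Terminal 2: with the service still running, pull a model, e.g. qwen2:latest.
./ollama pull qwen2:latest
```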