nuxlear owao committed
Commit 8b91580 · verified · 1 parent: b04e383

update -ngl arg to count the output layer (#6)


- update -ngl arg to count the output layer (e46235c7812188677b9dee397be152d0b897fcd1)


Co-authored-by: blakkd <[email protected]>
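
In llama.cpp, `-ngl` (`--n-gpu-layers`) sets how many layers to offload to the GPU, and that count includes the final output layer on top of the transformer blocks, hence the bump from 64 to 65 for this 64-block model. A minimal sketch of the corrected full-offload call, assuming the Q4_K_M file from this repo is in the working directory (same flags as the README diff below):

```bash
# -ngl counts the output layer too: 64 transformer blocks + 1 output layer = 65.
# A value larger than the model's total layer count also offloads everything.
llama-cli -m EXAONE-4.0-32B-GGUF-Q4_K_M.gguf \
    -fa -ngl 65 \
    --temp 0.0 --top-k 1 \
    -f inputs.txt -no-cnv
```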

Files changed (1):
  README.md (+2 −2)
README.md CHANGED
@@ -111,7 +111,7 @@ You can run EXAONE models locally using llama.cpp by following these steps:
 4. Generate result with greedy decoding.
 ```bash
 llama-cli -m EXAONE-4.0-32B-GGUF-Q4_K_M.gguf \
-    -fa -ngl 64 \
+    -fa -ngl 65 \
     --temp 0.0 --top-k 1 \
     -f inputs.txt -no-cnv
 ```
@@ -124,7 +124,7 @@ You can run EXAONE models locally using llama.cpp by following these steps:
 3. Run llama-server with EXAONE 4.0 Jinja template. You can find the [chat template file](https://huggingface.co/LGAI-EXAONE/EXAONE-4.0-32B-GGUF/blob/main/chat_template.jinja) in this repository.
 ```bash
 llama-server -m EXAONE-4.0-32B-Q4_K_M.gguf \
-    -c 131072 -fa -ngl 64 \
+    -c 131072 -fa -ngl 65 \
     --temp 0.6 --top-p 0.95 \
     --jinja --chat-template-file chat_template.jinja \
     --host 0.0.0.0 --port 8820 \