Commit 5425cac (parent 7cb16bf): Add instructions for Ollama

README.md CHANGED

@@ -101,6 +101,38 @@ python3 -m fastchat.serve.cli --model-path LLM360/AmberChat
| **LLM360/AmberChat** | **5.428125** |
| [Nous-Hermes-13B](https://huggingface.co/NousResearch/Nous-Hermes-13b) | 5.51 |

# Using Quantized Models with Ollama

Please follow these steps to use a quantized version of AmberChat on your personal computer or laptop:
1. First, install Ollama by following the instructions provided [here](https://github.com/jmorganca/ollama/tree/main?tab=readme-ov-file#ollama). Next, download a quantized model checkpoint (such as [amberchat.Q8_0.gguf](https://huggingface.co/TheBloke/AmberChat-GGUF/blob/main/amberchat.Q8_0.gguf) for the 8-bit version) from [TheBloke/AmberChat-GGUF](https://huggingface.co/TheBloke/AmberChat-GGUF/tree/main); a sketch of one way to do this from the command line follows the template. Then, create an Ollama Modelfile locally using the template provided below:
```
FROM amberchat.Q8_0.gguf

TEMPLATE """{{ .System }}
USER: {{ .Prompt }}
ASSISTANT:
"""
SYSTEM """A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
"""
PARAMETER stop "USER:"
PARAMETER stop "ASSISTANT:"
PARAMETER repeat_last_n 0
PARAMETER num_ctx 2048
PARAMETER seed 0
PARAMETER num_predict -1
```
Ensure that the `FROM` directive points to the downloaded checkpoint file.
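For reference, here is one possible way to fetch the checkpoint from the command line. This is a minimal sketch assuming the Hugging Face CLI (`huggingface-cli`, from the `huggingface_hub` package) is available; any download method works.

```bash
# Install the Hugging Face CLI first if needed:
#   pip install -U "huggingface_hub[cli]"

# Download the 8-bit GGUF checkpoint into the current directory
huggingface-cli download TheBloke/AmberChat-GGUF amberchat.Q8_0.gguf --local-dir .
```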
2. Now, you can proceed to build the model by running:
```bash
ollama create amberchat -f Modelfile
```
3. To run the model from the command line, execute the following:
```bash
ollama run amberchat
```
You only need to build the model once; after that, you can simply run it.
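Beyond the interactive CLI, the running Ollama server also exposes a local REST API. A minimal sketch, assuming the default port 11434 and the `amberchat` model built above:

```bash
# Ask the local Ollama server for a single, non-streaming completion
curl http://localhost:11434/api/generate -d '{
  "model": "amberchat",
  "prompt": "What is model quantization?",
  "stream": false
}'
```

The response is a JSON object whose `response` field contains the generated text.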
# Citation