Original model: https://huggingface.co/Deci/DeciLM-7B-Instruct
```
This repository's GGUFs require a [modified llama.cpp](https://github.com/ymcki/llama.cpp-b4139) that supports DeciLMCausalModel's variable Grouped Query Attention. Please download and compile it to run the GGUFs in this repository.

Please note that the HF model of DeciLM-7B-Instruct uses dynamic NTK-aware RoPE scaling. However, llama.cpp doesn't support it yet, so my modification simply ignores the dynamic NTK-aware RoPE scaling setting in config.json. Since the GGUFs seem to work as is, please use them for the time being until I figure out how to implement dynamic NTK-aware RoPE scaling.
## Download a file (not the whole branch) from below: