Update README.md
README.md
CHANGED
@@ -20,7 +20,8 @@ Original model: [GLM-4-9B-0414](https://huggingface.co/THUDM/GLM-4-9B-0414) by [
 
 ## Quantization notes
 Made with Exllamav2 0.2.9 with the default dataset. These quants require Exllamav2 0.2.9 or newer.
-
+I had some issues with chat-completion mode in TabbyAPI because of some functions in the Jinja2 template.
+But it should be usable in text-completion mode or with a custom template.
 Ensure it fits your GPU VRAM since Exllamav2 doesn't support native RAM offloading.
 # GLM-4-9B-0414
 