cgus
/

GLM-4-9B-0414-exl2

Text Generation

4-bit precision

Model card Files Files and versions Community

cgus commited on Apr 24

Commit

12a60e3

·

verified ·

1 Parent(s): 1a33b16

Update README.md

Files changed (1) hide show

README.md +2 -1

README.md CHANGED Viewed

@@ -20,7 +20,8 @@ Original model: [GLM-4-9B-0414](https://huggingface.co/THUDM/GLM-4-9B-0414) by [
 ## Qutantization notes
 Made with Exllamav2 0.2.9 with default dataset. These quants require Exllamav2 0.2.9 or newer.
-These quants can be used with TabbyAPI or Text-Generation-WebUI with RTX GPU (Windows) or RTX/ROCm (Linux).
 Ensure it fits your GPU VRAM since Exllamav2 doesn't support native RAM offloading.
 # GLM-4-9B-0414

 ## Qutantization notes
 Made with Exllamav2 0.2.9 with default dataset. These quants require Exllamav2 0.2.9 or newer.
+I had some issues with chat-completion mode with TabbyAPI because of some functions in Jinja2 template.
+But it should be usable in text-completion mode or with a custom template.
 Ensure it fits your GPU VRAM since Exllamav2 doesn't support native RAM offloading.
 # GLM-4-9B-0414