Update README.md
README.md CHANGED
@@ -3,9 +3,9 @@ license: apache-2.0
 language:
 - en
 base_model:
-- prithivMLmods/
+- prithivMLmods/Galactic-Qwen-14B-Exp2
 pipeline_tag: text-generation
-library_name:
+library_name: exllamav2
 tags:
 - text-generation-inference
 - math
@@ -112,6 +112,22 @@ model-index:
       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=prithivMLmods%2FGalactic-Qwen-14B-Exp2
       name: Open LLM Leaderboard
 ---
+# Galactic-Qwen-14B-Exp2-exl2
+Original model: [Galactic-Qwen-14B-Exp2](https://huggingface.co/prithivMLmods/Galactic-Qwen-14B-Exp2) by [prithivMLmods](https://huggingface.co/prithivMLmods)
+Based on: [Qwen2.5-14B-Instruct](https://huggingface.co/Qwen/Qwen2.5-14B-Instruct) by [Qwen](https://huggingface.co/Qwen)
+## Quants
+[4bpw h6 (main)](https://huggingface.co/cgus/Galactic-Qwen-14B-Exp2-exl2/tree/main)
+[4.5bpw h6](https://huggingface.co/cgus/Galactic-Qwen-14B-Exp2-exl2/tree/4.5bpw-h6)
+[5bpw h6](https://huggingface.co/cgus/Galactic-Qwen-14B-Exp2-exl2/tree/5bpw-h6)
+[6bpw h6](https://huggingface.co/cgus/Galactic-Qwen-14B-Exp2-exl2/tree/6bpw-h6)
+[8bpw h8](https://huggingface.co/cgus/Galactic-Qwen-14B-Exp2-exl2/tree/8bpw-h8)
+
+## Quantization notes
+Made with Exllamav2 0.2.8 using the default calibration dataset.
+Exl2 models can be used with TabbyAPI, Text-Generation-WebUI, and some other apps.
+They require an Nvidia RTX GPU on Windows, or an Nvidia RTX or ROCm-capable AMD GPU on Linux.
+The model has to fit entirely into VRAM to run properly; if it's bigger than your GPU can handle, use GGUF quants with llama.cpp-based apps instead.
+# Original model card
 
 
 # **Galactic-Qwen-14B-Exp2**