cgus/Qwen2.5-0.5B-Instruct-exl2

cgus committed on Nov 27, 2024

Commit a1b0e0f (verified) · 1 Parent(s): ed508d7

Update README.md

Files changed (1)

README.md +21 -3

@@ -4,12 +4,30 @@ license_link: https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct/blob/main/LICENS
 language:
 - en
 pipeline_tag: text-generation
-base_model: Qwen/Qwen2.5-0.5B
+base_model: Qwen/Qwen2.5-0.5B-Instruct
 tags:
 - chat
-library_name: transformers
+library_name: Exllamav2
 ---
+# Qwen2.5-0.5B-Instruct-exl2
+Model: [Qwen2.5-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct)
+Creator: [Qwen](https://huggingface.co/Qwen)
+## Quants
+[4bpw h6 (main)](https://huggingface.co/cgus/Qwen2.5-0.5B-Instruct-exl2/tree/main)
+[4.5bpw h6](https://huggingface.co/cgus/Qwen2.5-0.5B-Instruct-exl2/tree/4.5bpw-h6)
+[5bpw h6](https://huggingface.co/cgus/Qwen2.5-0.5B-Instruct-exl2/tree/5bpw-h6)
+[5.5bpw h6](https://huggingface.co/cgus/Qwen2.5-0.5B-Instruct-exl2/tree/5.5bpw-h6)
+[6bpw h6](https://huggingface.co/cgus/Qwen2.5-0.5B-Instruct-exl2/tree/6bpw-h6)
+[6.5bpw h6](https://huggingface.co/cgus/Qwen2.5-0.5B-Instruct-exl2/tree/6.5bpw-h6)
+[8bpw h8](https://huggingface.co/cgus/Qwen2.5-0.5B-Instruct-exl2/tree/8bpw-h8)
+## Quantization notes
+Made with Exllamav2 0.2.4 with the default dataset.
+These quants are meant to be used with Nvidia RTX 2000 series or newer cards on Windows. On Linux, AMD cards with ROCm can be used as well.
+These quants might be useful as a draft model for TabbyAPI. For other purposes, in my opinion, it's better to use bigger models.
+# Original model card
 # Qwen2.5-0.5B-Instruct
 ## Introduction
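
Each quant linked in the added README section lives on its own branch of the repository, with the 4bpw h6 quant on `main`. The sketch below shows one way to fetch a single quant, assuming the `huggingface_hub` package is installed. The helper names `resolve_revision` and `download_quant` and the short quant labels are hypothetical and are not part of this commit; only the repository ID and branch names come from the links in the diff.

```python
from typing import Optional

# Repository and branch names copied from the "Quants" links in the README diff.
REPO_ID = "cgus/Qwen2.5-0.5B-Instruct-exl2"

# Hypothetical short labels mapped to the branch that holds each quant.
QUANT_BRANCHES = {
    "4bpw-h6": "main",  # the default quant is stored on the main branch
    "4.5bpw-h6": "4.5bpw-h6",
    "5bpw-h6": "5bpw-h6",
    "5.5bpw-h6": "5.5bpw-h6",
    "6bpw-h6": "6bpw-h6",
    "6.5bpw-h6": "6.5bpw-h6",
    "8bpw-h8": "8bpw-h8",
}


def resolve_revision(quant: str) -> str:
    """Return the branch name for a quant label, or raise ValueError."""
    try:
        return QUANT_BRANCHES[quant]
    except KeyError:
        known = ", ".join(QUANT_BRANCHES)
        raise ValueError(f"unknown quant {quant!r}; expected one of: {known}") from None


def download_quant(quant: str, local_dir: Optional[str] = None) -> str:
    """Download one quant branch and return the local folder path.

    Imported lazily so the mapping above can be used without huggingface_hub.
    """
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        revision=resolve_revision(quant),
        local_dir=local_dir,
    )
```

Calling `download_quant("6bpw-h6", local_dir="models/Qwen2.5-0.5B-Instruct-exl2-6bpw")` would place that branch's files in the given folder (the path here is only an example), which can then be pointed to from a loader such as TabbyAPI.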