---
license: apache-2.0
base_model:
- Qwen/Qwen3-8B
pipeline_tag: text-generation
tags:
- horror
- 32k context
- reasoning
- thinking
- qwen3
---

(quants uploading...)

<H2>Qwen3-8B-NEO-Imatrix-Max-GGUF</H2>

NEO Imatrix quants of the new "Qwen 3 - 8B" model, with the "output tensor" at MAX (BF16) to improve reasoning and output generation.

The NEO Imatrix dataset was generated in house.

The Imatrix effect grows stronger the lower the quant, with IQ4_XS/IQ4_NL offering the best balance between quality and Imatrix effect.

These quants will also be the strongest for creative use cases.

For stronger reasoning, use higher quants.

The Q8_0 quant is maxed only, as the Imatrix has no effect at this quant level.

F16 is full precision.

Context length: 32K, plus 8K output generation (can be extended to 128K).

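As a usage sketch only: running one of these GGUF quants at the full 32K context with llama.cpp's `llama-cli` might look like the following. The quant filename here is hypothetical (pick the actual file from this repo), and the sampler values follow Qwen3's commonly recommended thinking-mode settings.

```shell
# Hypothetical quant filename - substitute the file you downloaded from this repo.
llama-cli -m Qwen3-8B-NEO-Imatrix-Max-IQ4_NL.gguf \
  -c 32768 \
  --temp 0.6 --top-p 0.95 \
  -p "Write a short horror scene set in an abandoned lighthouse."
```

Lower quants (IQ2/IQ3) will show the Imatrix effect more strongly; higher quants (Q6/Q8) favor reasoning quality, as noted above.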
<B>NOTE - Jinja Template / Template to Use with this Model:</B>

If you are having issues with the Jinja "auto template", use the CHATML template.

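For reference, a minimal ChatML prompt layout (the standard chat format used by Qwen models) looks like this; the system line is optional and the placeholder text is illustrative:

```
<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
{your prompt here}<|im_end|>
<|im_start|>assistant
```

Most front ends (LM Studio, text-generation-webui, etc.) have this available as a built-in "ChatML" preset.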
OR (option for LM Studio users):

Update the Jinja template (go to the site below, copy the "Jinja template", and paste it in):

[ https://lmstudio.ai/neil/qwen3-thinking ]

<B>Other Notes:</B>

Reasoning is ON by default in this model, and the model will auto-generate "think" block(s).

For benchmarks, usage info, and settings, please see the original model card here:

[ https://huggingface.co/Qwen/Qwen3-8B ]

[ Model card and examples to follow. ]