---
license: apache-2.0
base_model:
- Qwen/Qwen3-8B
pipeline_tag: text-generation
tags:
- horror
- 32k context
- reasoning
- thinking
- qwen3
---

(quants uploading...)

<H2>Qwen3-8B-NEO-Imatrix-Max-GGUF</H2>

NEO Imatrix quants of the new "Qwen 3 - 8B" model, with the "output tensor" at MAX (BF16) to improve reasoning and output generation.

The NEO Imatrix dataset was generated in house.

The Imatrix effect grows stronger the lower the quant, with IQ4_XS/IQ4_NL offering the best balance between quality and Imatrix effect.

These quants will also be the strongest for creative use cases.

For stronger reasoning, use higher quants.

The Q8_0 quant is maxed only, as the Imatrix has no effect at this quant level.

F16 is full precision.

Context length: 32K, plus 8K output generation (can be extended to 128K).

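As a usage sketch only: running one of these GGUF quants at the full 32K context with llama.cpp's `llama-cli` might look like the following. The quant filename here is hypothetical (pick the actual file from this repo), and the sampler values follow Qwen3's commonly recommended thinking-mode settings.

```shell
# Hypothetical quant filename - substitute the file you downloaded from this repo.
llama-cli -m Qwen3-8B-NEO-Imatrix-Max-IQ4_NL.gguf \
  -c 32768 \
  --temp 0.6 --top-p 0.95 \
  -p "Write a short horror scene set in an abandoned lighthouse."
```

Lower quants (IQ2/IQ3) will show the Imatrix effect more strongly; higher quants (Q6/Q8) favor reasoning quality, as noted above.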
<B>NOTE - Jinja Template / Template to Use with this Model:</B>

If you are having issues with the Jinja "auto template", use the CHATML template.

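For reference, a minimal ChatML prompt layout (the standard chat format used by Qwen models) looks like this; the system line is optional and the placeholder text is illustrative:

```
<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
{your prompt here}<|im_end|>
<|im_start|>assistant
```

Most front ends (LM Studio, text-generation-webui, etc.) have this available as a built-in "ChatML" preset.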
OR (option for LM Studio users):

Update the Jinja template (go to the site below, copy the "Jinja template", and paste it in):

[ https://lmstudio.ai/neil/qwen3-thinking ]

<B>Other Notes:</B>

Reasoning is ON by default in this model, and the model will auto-generate "think" block(s).

For benchmarks, usage info, and settings, please see the original model card here:

[ https://huggingface.co/Qwen/Qwen3-8B ]

[ Model card and examples to follow. ]