Text Generation
GGUF
Llama 3.2
instruct
128k context
all use cases
maxed quants
Neo Imatrix
finetune
chatml
gpt4
synthetic data
distillation
function calling
json mode
axolotl
roleplaying
chat
reasoning
r1
vllm
thinking
cot
deepseek
Qwen2.5
Hermes
DeepHermes
DeepSeek
DeepSeek-R1-Distill
Uncensored
creative
general usage
problem solving
brainstorming
solve riddles
fiction writing
plot generation
sub-plot generation
story generation
scene continue
storytelling
fiction story
story
writing
fiction
swearing
horror
conversational
Update README.md
README.md CHANGED

@@ -125,7 +125,7 @@ Q8 is a maxed quant only, as imatrix has no effect on this quant.
 
 Use this quant or F16 (full precision) for MAXIMUM reasoning/thinking performance.
 
-Note that IQ1s performance is low, whereas IQ2s are passable
+Note that IQ1s performance is low, whereas IQ2s are passable (but reasoning is reduced ... try IQ3s min for reasoning cases)
 
 More information on quants is in the document below "Highest Quality Settings / Optimal Operation Guide / Parameters and Samplers".
 