# Static GGUF Quants of allura-org/Q3-8B-Kintsugi

GGUF quantizations of allura-org/Q3-8B-Kintsugi, produced with llama.cpp.
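
For reference, this is roughly how quants like these are produced with llama.cpp: a minimal sketch that converts the original checkpoint to FP16 GGUF and then quantizes it. The local paths and output filenames are assumptions; adjust them to your own llama.cpp checkout and model directory.

```python
import subprocess

# Convert the original HF checkpoint to a full-precision GGUF file.
# convert_hf_to_gguf.py ships with llama.cpp; the paths here are assumptions.
subprocess.run(
    [
        "python", "convert_hf_to_gguf.py", "path/to/Q3-8B-Kintsugi",
        "--outtype", "f16",
        "--outfile", "Q3-8B-Kintsugi-FP16.gguf",
    ],
    check=True,
)

# Quantize the FP16 GGUF down to one of the smaller formats in the table below.
subprocess.run(
    [
        "./llama-quantize",
        "Q3-8B-Kintsugi-FP16.gguf",
        "Q3-8B-Kintsugi-Q4_K_M.gguf",
        "Q4_K_M",
    ],
    check=True,
)
```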

## Quants

| Quant | Link | Notes |
|-------|------|-------|
| FP16 | Download | Lossless. Not recommended unless you have an excessive amount of VRAM. |
| Q8_0 | Download | Basically lossless at half the size of FP16. |
| Q6_K | Download | Near-lossless, slightly smaller than Q8_0. |
| Q5_K_M | Download | Good quality/size balance; smaller than Q6_K with some loss. |
| Q5_K_S | Download | Slightly smaller than Q5_K_M with marginally more quality loss, but still usable. |
| Q4_K_M | Download | Okay for some tasks; significantly smaller than the Q5 variants. |
| Q4_K_S | Download | More compact than Q4_K_M; suitable for memory-limited devices. |
| Q3_K_M | Download | Very small; noticeable quality loss. |
| Q3_K_S | Download | Even smaller than Q3_K_M, with a greater quality trade-off. |
| Q2_K | Download | Very small with minimal resource needs; near-incoherent for most use cases. Not recommended unless you are on a Samsung Galaxy S5. |
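
Once downloaded, a quant can be loaded with the `llama-cpp-python` bindings, as in the sketch below. The `.gguf` filename is a guess at this repo's naming scheme, so check the file listing and substitute the real name.

```python
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

# Download one quant from this repo. The filename is hypothetical;
# check the repo's file list for the actual .gguf name.
model_path = hf_hub_download(
    repo_id="allura-quants/allura-org_Q3-8B-Kintsugi-GGUF",
    filename="Q3-8B-Kintsugi.Q4_K_M.gguf",
)

# Load the quantized model. n_gpu_layers=-1 offloads all layers to the GPU
# if llama-cpp-python was built with GPU support; drop it for CPU-only use.
llm = Llama(model_path=model_path, n_ctx=4096, n_gpu_layers=-1)

out = llm("Write a haiku about kintsugi.", max_tokens=64)
print(out["choices"][0]["text"])
```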

Base model: Qwen/Qwen3-8B-Base