# Static GGUF Quants of allura-org/Q3-8B-Kintsugi
GGUF quants of allura-org/Q3-8B-Kintsugi, quantized with llama.cpp.
## Quants
| Quant | Link | Notes |
|---|---|---|
| FP16 | Download | Lossless. Not recommended unless you have an excessive amount of VRAM. |
| Q8_0 | Download | Basically lossless, at half the size of FP16. |
| Q6_K | Download | Near-lossless, slightly smaller than Q8_0. |
| Q5_K_M | Download | Good quality/size balance; smaller than Q6_K with some quality loss. |
| Q5_K_S | Download | Slightly smaller than Q5_K_M; marginally more compressed but still usable. |
| Q4_K_M | Download | Okay for most tasks; significantly smaller than the Q5 variants. |
| Q4_K_S | Download | More compact than Q4_K_M; suitable for memory-limited devices. |
| Q3_K_M | Download | Very small; noticeable quality loss. |
| Q3_K_S | Download | Even more compact than Q3_K_M, trading further quality for size. |
| Q2_K | Download | Very small and minimal in resource use; near-incoherent for most use cases. Not recommended unless you are on a Samsung Galaxy S5. |
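Since a GGUF file's size scales roughly linearly with its average bits per weight, you can estimate each quant's download size before fetching it. The sketch below is illustrative only: the bits-per-weight figures are rough llama.cpp averages, and the 8B parameter count is inferred from the model name, so treat both as assumptions rather than measured values.

```python
# Rough download-size estimator for static GGUF quants.
# ASSUMPTIONS: bits-per-weight values are approximate llama.cpp averages,
# and the 8e9 parameter count is inferred from the "8B" in the model name.
APPROX_BPW = {
    "FP16": 16.0,
    "Q8_0": 8.5,
    "Q6_K": 6.6,
    "Q5_K_M": 5.7,
    "Q5_K_S": 5.5,
    "Q4_K_M": 4.8,
    "Q4_K_S": 4.6,
    "Q3_K_M": 3.9,
    "Q3_K_S": 3.5,
    "Q2_K": 3.35,
}

def approx_size_gb(n_params: float, bpw: float) -> float:
    """Estimated file size in GB: parameters * bits-per-weight / 8 bits per byte."""
    return n_params * bpw / 8 / 1e9

for quant, bpw in APPROX_BPW.items():
    print(f"{quant:>6}: ~{approx_size_gb(8e9, bpw):.1f} GB")
```

Actual files are somewhat larger than this estimate because embedding/output tensors are often kept at a higher precision than the bulk of the weights.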