# Static GGUF Quants of allura-org/Q3-8B-Kintsugi
GGUF quants of allura-org/Q3-8B-Kintsugi, quantized with llama.cpp.
## Quants
| Quant | Link | Notes |
|---|---|---|
| FP16 | Download | Lossless. Not recommended unless you have an excessive amount of VRAM. |
| Q8_0 | Download | Basically lossless, at half the size of FP16. |
| Q6_K | Download | Near-lossless, slightly smaller than Q8_0. |
| Q5_K_M | Download | Good quality/size balance; smaller than Q6_K with some quality loss. |
| Q5_K_S | Download | Slightly smaller than Q5_K_M; marginally more compressed but still usable. |
| Q4_K_M | Download | Okay for most tasks; significantly smaller than the Q5 variants. |
| Q4_K_S | Download | More compact than Q4_K_M; suitable for memory-limited devices. |
| Q3_K_M | Download | Very small; noticeable quality loss. |
| Q3_K_S | Download | Even more compact than Q3_K_M, trading further quality for size. |
| Q2_K | Download | Very small and minimal in resource use; near-incoherent for most use cases. Not recommended unless you are on a Samsung Galaxy S5. |
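Since a GGUF file's size scales roughly linearly with its average bits per weight, you can estimate each quant's download size before fetching it. The sketch below is illustrative only: the bits-per-weight figures are rough llama.cpp averages, and the 8B parameter count is inferred from the model name, so treat both as assumptions rather than measured values.

```python
# Rough download-size estimator for static GGUF quants.
# ASSUMPTIONS: bits-per-weight values are approximate llama.cpp averages,
# and the 8e9 parameter count is inferred from the "8B" in the model name.
APPROX_BPW = {
    "FP16": 16.0,
    "Q8_0": 8.5,
    "Q6_K": 6.6,
    "Q5_K_M": 5.7,
    "Q5_K_S": 5.5,
    "Q4_K_M": 4.8,
    "Q4_K_S": 4.6,
    "Q3_K_M": 3.9,
    "Q3_K_S": 3.5,
    "Q2_K": 3.35,
}

def approx_size_gb(n_params: float, bpw: float) -> float:
    """Estimated file size in GB: parameters * bits-per-weight / 8 bits per byte."""
    return n_params * bpw / 8 / 1e9

for quant, bpw in APPROX_BPW.items():
    print(f"{quant:>6}: ~{approx_size_gb(8e9, bpw):.1f} GB")
```

Actual files are somewhat larger than this estimate because embedding/output tensors are often kept at a higher precision than the bulk of the weights.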