Recommended

Jinx-Qwen3-4B-gguf-q6_k-q-8 (mixed precision): selected weights (the output tensor, the token embeddings, and the attention/FFN layers in the first and last blocks) are quantized to Q8_0; all remaining tensors use Q6_K. The goal is a smaller memory footprint with inference fidelity preserved.
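The selection rule above can be sketched as a small per-tensor lookup. This is an illustrative sketch only, not the script used to build this file. It assumes tensor names follow the llama.cpp GGUF convention (`token_embd.weight`, `output.weight`, `blk.<i>.attn_q.weight`, and so on); `pick_quant_type`, `n_blocks`, and `edge_blocks` are hypothetical names, and how many blocks count as "first and last" is not stated on this page, so it is left as a parameter.

```python
import re

# Assumption: llama.cpp-style GGUF tensor names for attention and FFN weight matrices.
BLOCK_WEIGHT_RE = re.compile(
    r"^blk\.(\d+)\.(?:attn_(?:q|k|v|output)|ffn_(?:gate|up|down))\.weight$"
)

def pick_quant_type(name: str, n_blocks: int, edge_blocks: int = 1) -> str:
    """Return the quantization type for a 2-D weight tensor under the mixed scheme.

    Hypothetical helper: `edge_blocks` is how many blocks at each end of the
    stack are kept at Q8_0 (the model card does not give this number).
    Norm and other 1-D tensors are out of scope for this sketch.
    """
    # Output projection and token embeddings are always kept at Q8_0.
    if name in ("output.weight", "token_embd.weight"):
        return "Q8_0"
    match = BLOCK_WEIGHT_RE.match(name)
    if match:
        block = int(match.group(1))
        # Attention/FFN weights in the first and last blocks also get Q8_0.
        if block < edge_blocks or block >= n_blocks - edge_blocks:
            return "Q8_0"
    # Everything else falls back to Q6_K.
    return "Q6_K"

# Usage, assuming a 36-block model (an assumption, not stated on this page):
for tensor in ("token_embd.weight", "blk.0.attn_q.weight", "blk.17.ffn_up.weight"):
    print(tensor, "->", pick_quant_type(tensor, n_blocks=36))
```

Keeping the rule in one function makes it easy to audit which tensors end up at the higher precision before running an actual quantizer.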

Downloads last month: 552
Format: GGUF
Model size: 4.02B params
Architecture: qwen3

Model tree for marcelone/Jinx-Qwen3-4B-gguf

Base model: Qwen/Qwen3-4B-Base
Finetuned: Qwen/Qwen3-4B
Quantized: this model
