# II-Search-CIR-4B-GGUF
II-Search-CIR-4B is a 4-billion-parameter language model built on Qwen3-4B and enhanced with Code-Integrated Reasoning (CIR). During inference it can not only call external tools (such as web search and web visit) through code blocks, but also programmatically process, filter, and reason over the results within those blocks. Optimized through supervised fine-tuning and reinforcement learning on challenging reasoning datasets, the model achieves state-of-the-art or leading results on major factual QA and information-seeking benchmarks (such as OpenAI/SimpleQA, Google/Frames, and Seal_0). It can be deployed efficiently with vLLM or SGLang at context lengths of up to 128k tokens (via YaRN RoPE scaling), supporting research, educational, and web-integrated applications. Datasets, code samples, and evaluation results are provided in the official Hugging Face repository.
## Model Files
| File Name | Size | Quant Type |
|---|---|---|
| II-Search-4B-GGUF.BF16.gguf | 8.05 GB | BF16 |
| II-Search-4B-GGUF.F16.gguf | 8.05 GB | F16 |
| II-Search-4B-GGUF.F32.gguf | 16.1 GB | F32 |
| II-Search-4B-GGUF.Q2_K.gguf | 1.67 GB | Q2_K |
| II-Search-4B-GGUF.Q3_K_L.gguf | 2.24 GB | Q3_K_L |
| II-Search-4B-GGUF.Q3_K_M.gguf | 2.08 GB | Q3_K_M |
| II-Search-4B-GGUF.Q3_K_S.gguf | 1.89 GB | Q3_K_S |
| II-Search-4B-GGUF.Q4_K_M.gguf | 2.5 GB | Q4_K_M |
| II-Search-4B-GGUF.Q4_K_S.gguf | 2.38 GB | Q4_K_S |
| II-Search-4B-GGUF.Q5_K_M.gguf | 2.89 GB | Q5_K_M |
| II-Search-4B-GGUF.Q5_K_S.gguf | 2.82 GB | Q5_K_S |
| II-Search-4B-GGUF.Q6_K.gguf | 3.31 GB | Q6_K |
| II-Search-4B-GGUF.Q8_0.gguf | 4.28 GB | Q8_0 |
## Quants Usage
(Sorted by size, not necessarily quality. IQ quants are often preferable to non-IQ quants of similar size.)
ikawrakow has published a handy graph comparing some lower-quality quant types (lower is better).
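As a rough back-of-envelope check (an illustration added here, not part of the original card), the effective bits per weight of a given file can be estimated from its size, assuming roughly 4 billion parameters and 1 GB = 1e9 bytes; GGUF files also carry metadata and keep some tensors at higher precision, so this is only approximate:

```python
def bits_per_weight(file_size_gb: float, n_params_b: float = 4.0) -> float:
    """Approximate bits per weight: total bits in the file / parameter count."""
    # file_size_gb * 8e9 bits, divided by n_params_b * 1e9 parameters.
    return file_size_gb * 8 / n_params_b

print(bits_per_weight(2.5))   # Q4_K_M: ~5 bits per weight
print(bits_per_weight(1.67))  # Q2_K: ~3.3 bits per weight
```

This makes it easy to see why, e.g., the "4-bit" K-quants land above 4 bits per weight: some tensors are stored at higher precision.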
## Model tree for prithivMLmods/II-Search-CIR-4B-GGUF

Base model: Intelligent-Internet/II-Search-CIR-4B