Quantized GGUF models for GRMR-V3-G4B

This repository contains GGUF quantized versions of qingy2024/GRMR-V3-G4B.

IMPORTANT

If you are using llama.cpp, make sure to pass the --jinja flag to your llama-cli or llama-server command; otherwise the correct chat template will not be applied.
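
For example, a minimal invocation might look like the following; the GGUF filename here is an assumed example, so point -m at whichever quantization you actually downloaded:

```bash
# Interactive chat; --jinja makes llama.cpp apply the chat template embedded in the GGUF.
# The filename is an assumed example, not necessarily the exact name in this repository.
llama-cli -m GRMR-V3-G4B-Q4_K_M.gguf --jinja

# Or serve an OpenAI-compatible HTTP API with the same template handling:
llama-server -m GRMR-V3-G4B-Q4_K_M.gguf --jinja --port 8080
```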

Available quantizations (a download example follows the list):

  • FP16 (full precision)
  • Q2_K
  • Q3_K_L
  • Q3_K_M
  • Q3_K_S
  • Q4_K_M
  • Q4_K_S
  • Q5_K_M
  • Q5_K_S
  • Q6_K
  • Q8_0
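
To fetch a single quantization from this repository, one option is the Hugging Face CLI. This is a minimal sketch: the --include pattern assumes the files follow a typical <name>-<quant>.gguf naming scheme, so check the repository's file list for the exact names.

```bash
# Download only the Q4_K_M file into the current directory.
# The glob pattern is an assumption about the file naming; adjust it to the actual file names.
huggingface-cli download qingy2024/GRMR-V3-G4B-GGUF --include "*Q4_K_M*.gguf" --local-dir .
```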

Original model

The original, full-precision model is available at qingy2024/GRMR-V3-G4B.

Generated on

Fri Jun 6 05:40:02 UTC 2025

Model details

  • Format: GGUF
  • Model size: 3.88B params
  • Architecture: gemma3