Gemma 3 4B Instruct Quantized Models

This repo offers quantized versions of google/gemma-3-4b-it for use with llama.cpp. Quantization was performed with an unofficial Docker image, using an importance matrix calibrated on 100 rows from the agentlans/LinguaNova dataset to preserve coherence and multilingual performance. The importance matrix file is included in this repo.
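
For reference, a quantization run of this kind can be sketched as a llama.cpp `llama-quantize` invocation that applies the included importance matrix. The filenames below are hypothetical placeholders, not the actual files in this repo:

```python
import shlex

# Hypothetical filenames -- substitute the actual unquantized GGUF, the
# imatrix file from calibration, and your target quant type (e.g. Q4_K_M).
cmd = [
    "llama-quantize",
    "--imatrix", "imatrix.dat",     # importance matrix from calibration
    "gemma-3-4b-it-f16.gguf",       # unquantized input model
    "gemma-3-4b-it-Q4_K_M.gguf",    # quantized output model
    "Q4_K_M",                       # quantization type
]
print(shlex.join(cmd))
```

The importance matrix steers the quantizer toward preserving the weights that matter most on the calibration data, which is why the calibration set (here, multilingual text) shapes what the quantized model retains.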

Limitations

  • Optimized for multilingual natural language tasks.
  • May underperform on math, coding, and untested multimodal features.
  • Shares all limitations and biases of the original Gemma 3 models.

Notes

  • Ideal for resource-constrained environments.
  • Test on your data for best results.
  • See the original google/gemma-3-4b-it page for full details and guidelines.

This card covers only the quantized models.
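
As a starting point, here is a minimal inference sketch using the llama-cpp-python bindings (one of several llama.cpp front ends; an assumption, not a requirement). The quant filename and the settings are hypothetical placeholders:

```python
# Sketch assuming llama-cpp-python is installed and a quant file has been
# downloaded from this repo; the filename below is a hypothetical placeholder.
MODEL_PATH = "gemma-3-4b-it-Q4_K_M.gguf"

# Illustrative settings, not recommendations: n_gpu_layers=-1 offloads all
# layers to GPU when one is available.
settings = {"n_ctx": 4096, "n_gpu_layers": -1}

try:
    from llama_cpp import Llama

    llm = Llama(model_path=MODEL_PATH, **settings)
    out = llm.create_chat_completion(
        messages=[{"role": "user", "content": "Say hello in French."}],
        max_tokens=64,
    )
    print(out["choices"][0]["message"]["content"])
except Exception as exc:  # bindings or model file may be absent
    print(f"skipping inference: {exc}")
```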

Model details

  • Format: GGUF
  • Model size: 3.88B params
  • Architecture: gemma3
  • Quantization levels: 4-bit, 5-bit, 8-bit
