Gemma 3 4B Instruct Quantized Models

This repo offers quantized versions of google/gemma-3-4b-it for use with llama.cpp. Quantization was performed with an unofficial Docker image, using an importance matrix calibrated on 100 rows from the agentlans/LinguaNova dataset to preserve coherence and multilingual performance. The importance matrix file is included in this repo.
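
For reference, a quantization run of this kind can be sketched as a llama.cpp `llama-quantize` invocation that applies the included importance matrix. The filenames below are hypothetical placeholders, not the actual files in this repo:

```python
import shlex

# Hypothetical filenames -- substitute the actual unquantized GGUF, the
# imatrix file from calibration, and your target quant type (e.g. Q4_K_M).
cmd = [
    "llama-quantize",
    "--imatrix", "imatrix.dat",     # importance matrix from calibration
    "gemma-3-4b-it-f16.gguf",       # unquantized input model
    "gemma-3-4b-it-Q4_K_M.gguf",    # quantized output model
    "Q4_K_M",                       # quantization type
]
print(shlex.join(cmd))
```

The importance matrix steers the quantizer toward preserving the weights that matter most on the calibration data, which is why the calibration set (here, multilingual text) shapes what the quantized model retains.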

Limitations

  • Optimized for multilingual natural language tasks.
  • May underperform on math, coding, and untested multimodal features.
  • Shares all limitations and biases of the original Gemma 3 models.

Notes

  • Ideal for resource-constrained environments.
  • Test on your data for best results.
  • See the original google/gemma-3-4b-it page for full details and guidelines.

This card covers only the quantized models.
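
As a starting point, here is a minimal inference sketch using the llama-cpp-python bindings (one of several llama.cpp front ends; an assumption, not a requirement). The quant filename and the settings are hypothetical placeholders:

```python
# Sketch assuming llama-cpp-python is installed and a quant file has been
# downloaded from this repo; the filename below is a hypothetical placeholder.
MODEL_PATH = "gemma-3-4b-it-Q4_K_M.gguf"

# Illustrative settings, not recommendations: n_gpu_layers=-1 offloads all
# layers to GPU when one is available.
settings = {"n_ctx": 4096, "n_gpu_layers": -1}

try:
    from llama_cpp import Llama

    llm = Llama(model_path=MODEL_PATH, **settings)
    out = llm.create_chat_completion(
        messages=[{"role": "user", "content": "Say hello in French."}],
        max_tokens=64,
    )
    print(out["choices"][0]["message"]["content"])
except Exception as exc:  # bindings or model file may be absent
    print(f"skipping inference: {exc}")
```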

Model details

  • Format: GGUF
  • Model size: 3.88B params
  • Architecture: gemma3
  • Quantization levels: 4-bit, 5-bit, 8-bit
