Recommended way to run this model:
llama-server -hf ggml-org/gemma-3-1b-it-qat-GGUF -c 0 -fa
Then, access http://localhost:8080
Chat template
4-bit