https://huggingface.co/zai-org/GLM-4.5-Air
This is a MoE model that weighs in at 106B total parameters with 12B active. I have no idea whether it is any good, so I am interested in giving this dark horse a try.
It uses the Glm4MoeForCausalLM architecture, but llama.cpp unfortunately only supports the Glm4ForCausalLM and Glm4vForConditionalGeneration architectures at the moment. Please follow https://github.com/ggml-org/llama.cpp/issues/14921 and let us know when support for Glm4MoeForCausalLM is implemented so we can quantize this model.
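For anyone who wants to verify this themselves, here is a minimal sketch (assuming the huggingface_hub Python package is installed) of checking which architecture a repo declares in its config.json:

```python
import json
from huggingface_hub import hf_hub_download

# Fetch the model's config.json and inspect the declared architecture.
config_path = hf_hub_download(repo_id="zai-org/GLM-4.5-Air", filename="config.json")
with open(config_path) as f:
    config = json.load(f)

# GLM-4.5-Air declares ["Glm4MoeForCausalLM"], which llama.cpp does not support yet.
print(config.get("architectures"))
```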
I think it is good to go. GLM 4.5 support has been merged into llama.cpp. Note that MTP (multi-token prediction) is not supported; I don't know whether a re-GGUFing would be needed in the future if that feature is added.
All the GLM 4.5 models got queued yesterday evening, shortly after GLM 4.5 support was merged: https://huggingface.co/mradermacher/model_requests/discussions/1239
This includes GLM 4.5 Air:
Download page: https://hf.tst.eu/model#GLM-4.5-Air-GGUF
Static quants: https://huggingface.co/mradermacher/GLM-4.5-Air-GGUF/tree/main
imatrix quants: https://huggingface.co/mradermacher/GLM-4.5-Air-i1-GGUF/tree/main