https://huggingface.co/zai-org/GLM-4.5-Air

#1213
by SabinStargem - opened

This model weighs in at 106B total parameters, with 12B active. I have no idea whether it will be any good, so I am interested in giving this dark horse a try.

https://huggingface.co/zai-org/GLM-4.5-Air

It uses the Glm4MoeForCausalLM architecture, but unfortunately llama.cpp currently only supports the Glm4ForCausalLM and Glm4vForConditionalGeneration architectures. Please follow https://github.com/ggml-org/llama.cpp/issues/14921 and let us know when support for Glm4MoeForCausalLM is implemented so we can quantize this model.
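For anyone who wants to check this kind of thing before filing a request, here is a minimal sketch of the comparison described above: read the `architectures` field from a model's `config.json` and compare it against a list of supported names. The supported set below is an assumption that only mirrors the two architectures named in this thread, not llama.cpp's actual support list, and the function name and example config are hypothetical.

```python
import json

# Assumption: this set only mirrors the architectures named in this thread;
# it is NOT llama.cpp's real or complete list of supported architectures.
SUPPORTED_ARCHITECTURES = {"Glm4ForCausalLM", "Glm4vForConditionalGeneration"}


def unsupported_architectures(config_text, supported=SUPPORTED_ARCHITECTURES):
    """Return the architectures listed in a config.json string that are not in `supported`."""
    config = json.loads(config_text)
    # A missing "architectures" key is treated as an empty list.
    return [name for name in config.get("architectures", []) if name not in supported]


# Hypothetical, trimmed-down config.json content for illustration only.
example_config = '{"architectures": ["Glm4MoeForCausalLM"]}'
print(unsupported_architectures(example_config))  # prints ['Glm4MoeForCausalLM']
```

An empty result would mean every listed architecture is in the assumed supported set; anything else points at the name to look for in the llama.cpp issue tracker.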

I think it is good to go. GLM 4.5 support has been merged into llama.cpp. Note that MTP (multi-token prediction) is not supported, and I don't know whether the GGUF files would need to be regenerated if that feature is added in the future.

https://github.com/ggml-org/llama.cpp/pull/14939
