I wish more people would make exl3 quants. I'll probably be making some sized for 24 GB of VRAM.

Prompt format

```
[gMASK]<sop><|system|>
{system_prompt}<|user|>
{prompt}<|assistant|>
<think>
```
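The template above can be filled in programmatically. Here is a minimal sketch; the special tokens are taken verbatim from the template, while the helper name `build_prompt` is our own illustration, not part of any library:

```python
def build_prompt(system_prompt: str, prompt: str) -> str:
    """Assemble a GLM-Z1 chat prompt from the template above.

    The trailing <think> tag primes the model to emit its
    reasoning block before the final answer.
    """
    return (
        "[gMASK]<sop><|system|>\n"
        f"{system_prompt}<|user|>\n"
        f"{prompt}<|assistant|>\n"
        "<think>"
    )

# Example usage:
print(build_prompt("You are a helpful assistant.", "What is 2+2?"))
```

If you are running the model through a backend that applies the chat template from `tokenizer_config.json` automatically (e.g. via `apply_chat_template`), you should not also prepend this string by hand, or the tokens will be duplicated.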
Safetensors
Model size: 8.97B params
Tensor type: FP16 · I16
Model tree for lmganon123/THUDM_GLM-Z1-32B-0414-exl3-4.0bpw
Quantized (21) → this model