I wish more people would make exl3 quants. I'll probably be making some sized for 24 GB of VRAM.

Prompt format

```
[gMASK]<sop><|system|>
{system_prompt}<|user|>
{prompt}<|assistant|>
<think>
```
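The template above can be filled in programmatically. Here is a minimal sketch; the special tokens are taken verbatim from the template, while the helper name `build_prompt` is our own illustration, not part of any library:

```python
def build_prompt(system_prompt: str, prompt: str) -> str:
    """Assemble a GLM-Z1 chat prompt from the template above.

    The trailing <think> tag primes the model to emit its
    reasoning block before the final answer.
    """
    return (
        "[gMASK]<sop><|system|>\n"
        f"{system_prompt}<|user|>\n"
        f"{prompt}<|assistant|>\n"
        "<think>"
    )

# Example usage:
print(build_prompt("You are a helpful assistant.", "What is 2+2?"))
```

If you are running the model through a backend that applies the chat template from `tokenizer_config.json` automatically (e.g. via `apply_chat_template`), you should not also prepend this string by hand, or the tokens will be duplicated.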
Safetensors
Model size: 8.97B params
Tensor type: FP16 · I16
Model tree for lmganon123/THUDM_GLM-Z1-32B-0414-exl3-4.0bpw
Quantized (21) → this model