demo space
#4 · 2 comments · opened almost 2 years ago by matthoffner

Looks like the starchat-alpha-ggml-q4_1.bin is broken
#3 · 8 comments · opened almost 2 years ago by xhyi

Which inference repo is this quantized for?
#2 · 3 comments · opened almost 2 years ago by xhyi

Can the quantized model be loaded on GPU for faster inference?
#1 · 6 comments · opened almost 2 years ago by MohamedRashad