GGUF?

#2
by MoonRide - opened

CFT looks interesting, but it would be nice to have some GGUFs for it so we can do quick local evaluations. IQ3_XS+ quants are quite usable, and they make it possible to run ~30B models on a GPU with 16 GB of VRAM (with reduced context size). @bartowski ?
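
For anyone wondering where the 16 GB figure comes from, here is a rough back-of-the-envelope sketch. It assumes about 3.3 bits per weight for IQ3_XS (the nominal figure; real files tend to come out somewhat larger because some tensors are kept at higher precision) and treats 16 GB as 16 × 10^9 bytes. The helper name `estimate_weights_gb` is made up for illustration.

```python
def estimate_weights_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GB (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# Assumed values: ~30B parameters, ~3.3 bits per weight (nominal IQ3_XS).
weights_gb = estimate_weights_gb(30e9, 3.3)

# Whatever is left of a 16 GB card has to hold the KV cache and other buffers.
headroom_gb = 16 - weights_gb

print(f"weights: ~{weights_gb:.1f} GB, headroom: ~{headroom_gb:.1f} GB")
```

Under those assumptions the weights alone take a bit over 12 GB, which leaves only a few GB for the KV cache and compute buffers. That is why the context size has to be reduced.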

@bartowski Thank you very much! <3

MoonRide changed discussion status to closed
