About

IQ2_KS, IQ4_KS, and IQ5_KS are Gen 2 IQ_K quants from Ikawrakow. They are faster than the Gen 1 IK quants (IQ2_K to IQ6_K): for prompt processing (PP) for sure, and possibly for token generation (TG) as well. They work with my latest release of Croco.Cpp:

https://github.com/Nexesenex/croco.cpp/releases/tag/v1.92055_b5145_RM1.102

And of course, they work on IK_Llama.cpp.

A CUDA GPU (Pascal or more recent) is required; I didn't adapt or compile the builds for anything else.
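As a minimal sketch of how such a quant is typically run (the GGUF filename and binary path below are assumptions for illustration, not taken from this card), an ik_llama.cpp `llama-cli` invocation with full CUDA offload looks like:

```shell
# Hypothetical filename: check the repo's file list for the quant you actually downloaded.
MODEL="mistral-small-3.1-24b-instruct-2503-IQ4_KS.gguf"

# -ngl 99 offloads all layers to the CUDA GPU (Pascal or newer, per this card).
./llama-cli -m "$MODEL" -ngl 99 -p "Hello from an IQ4_KS quant."
```

Reduce `-ngl` if the 24B model does not fit entirely in VRAM at your chosen quant size.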

Format: GGUF
Model size: 23.6B params
Architecture: llama

Quantization: 2-bit


Model: NexesQuants/mistral-small-3.1-24b-instruct-2503-iMat-IKLQ-GGUF