About

IQ2_KS, IQ4_KS, and IQ5_KS are Gen 2 IQ_K quants from Ikawrakow. They are faster than the Gen 1 IK quants (IQ2_K to IQ6_K): for prompt processing (PP) for sure, and possibly for token generation (TG) as well. They work with my latest release of Croco.Cpp:

https://github.com/Nexesenex/croco.cpp/releases/tag/v1.92055_b5145_RM1.102

And of course, they work on IK_Llama.cpp.

A CUDA GPU (Pascal or more recent) is required; I didn't adapt or compile the builds for anything else.
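As a minimal sketch of how such a quant is typically run (the GGUF filename and binary path below are assumptions for illustration, not taken from this card), an ik_llama.cpp `llama-cli` invocation with full CUDA offload looks like:

```shell
# Hypothetical filename: check the repo's file list for the quant you actually downloaded.
MODEL="mistral-small-3.1-24b-instruct-2503-IQ4_KS.gguf"

# -ngl 99 offloads all layers to the CUDA GPU (Pascal or newer, per this card).
./llama-cli -m "$MODEL" -ngl 99 -p "Hello from an IQ4_KS quant."
```

Reduce `-ngl` if the 24B model does not fit entirely in VRAM at your chosen quant size.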

Format: GGUF
Model size: 23.6B params
Architecture: llama

Quantization: 2-bit


Model: NexesQuants/mistral-small-3.1-24b-instruct-2503-iMat-IKLQ-GGUF