For this quantization, we used 1 codebook of 16 bits.
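
As a rough illustration of what "1 codebook of 16 bits" implies, the sketch below computes the codebook size and the index bits spent per weight. The group size of 8 weights per code is an assumption (it is not stated in this card, though it would be consistent with the "2Bit" in the model name), and `bits_per_weight` is a hypothetical helper, not part of any library. The estimate ignores the storage cost of the codebook itself, the scales, and any layers left unquantized.

```python
# Sketch: index bits per weight for additive codebook quantization.
# ASSUMPTION: each 16-bit code indexes a group of 8 weights (not stated in this card).

def bits_per_weight(num_codebooks: int, codebook_bits: int, group_size: int) -> float:
    """Index bits spent per weight, ignoring codebook and scale storage."""
    return num_codebooks * codebook_bits / group_size

# A 16-bit index can address 2**16 distinct codebook entries.
codebook_entries = 2 ** 16

print(codebook_entries)           # 65536
print(bits_per_weight(1, 16, 8))  # 2.0
```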

Results:

| Model | Quantization | MMLU (5-shot) | GSM8k (8-shot) | Model size, GB |
|---|---|---|---|---|
| CohereForAI/c4ai-command-r-v01 | None | 0.6755 | 0.6065 | 70.0 |
| | 1x16 | 0.5719 | 0.3760 | 12.7 |
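
To put the reported numbers side by side, here is a small sketch that derives the size reduction and the absolute score drops from the results above. The dictionary layout and variable names are illustrative only. The values are copied from the results, and nothing here re-runs the evaluations.

```python
# Sketch: derive size reduction and score drops from the reported results.
# Values are copied from the results table; names are illustrative.
results = {
    "None": {"mmlu": 0.6755, "gsm8k": 0.6065, "size_gb": 70.0},
    "1x16": {"mmlu": 0.5719, "gsm8k": 0.3760, "size_gb": 12.7},
}

base, quant = results["None"], results["1x16"]

compression = base["size_gb"] / quant["size_gb"]  # how many times smaller
mmlu_drop = base["mmlu"] - quant["mmlu"]          # absolute accuracy drop
gsm8k_drop = base["gsm8k"] - quant["gsm8k"]       # absolute accuracy drop

print(f"size reduction: {compression:.2f}x")  # size reduction: 5.51x
print(f"MMLU drop:  {mmlu_drop:.4f}")         # MMLU drop:  0.1036
print(f"GSM8k drop: {gsm8k_drop:.4f}")        # GSM8k drop: 0.2305
```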


