dranger003
/

Senku-70B-iMat.GGUF

Text Generation

Inference Endpoints

Model card Files Files and versions Community

Edit model card

GGUF importance matrix (imatrix) quants for https://huggingface.co/ShinojiResearch/Senku-70B-Full
The importance matrix was trained for ~50K tokens (105 batches of 512 tokens) using a general purpose imatrix calibration dataset.
The imatrix is being used on the K-quants as well.

2024-02-26: Updating quants - IQ3_M/IQ3_S/IQ3_XS and IQ2_M/IQ2_S (requires latest commit a33e6a0d).

Layers	Context	Template
80	32764	<\|im_start\|>system {instructions}<\|im_end\|> <\|im_start\|>user {prompt}<\|im_end\|> <\|im_start\|>assistant {response}

Downloads last month: 176

GGUF

Model size

69B params

Architecture

llama

1-bit

2-bit

3-bit

4-bit

Inference Examples

Text Generation

Inference API (serverless) does not yet support gguf models for this pipeline type.