These are converted weights of the L3-8B-Stheno-v3.2 model, produced as an Unsloth 4-bit dynamic quant using this Colab notebook.

About this Conversion

This conversion uses Unsloth to load the model in 4-bit format and force-save it in the same 4-bit format.
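
The snippet below is a minimal sketch of that flow, assuming the standard Unsloth API (`FastLanguageModel.from_pretrained` / `save_pretrained`); the source repo name and output path are illustrative, not taken from the notebook, and Unsloth also offers `save_pretrained_merged(..., save_method="merged_4bit_forced")` as an explicit force-save path.

```python
from unsloth import FastLanguageModel

# Load the original checkpoint directly in 4-bit (BitsAndBytes under the hood).
# "Sao10K/L3-8B-Stheno-v3.2" is the assumed source repo for this model.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Sao10K/L3-8B-Stheno-v3.2",
    load_in_4bit=True,
)

# Save the already-quantized weights so the resulting repo ships 4-bit tensors.
model.save_pretrained("L3-8B-Stheno-v3.2-bnb-4bit")
tokenizer.save_pretrained("L3-8B-Stheno-v3.2-bnb-4bit")
```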

How 4-bit Quantization Works

  • The actual 4-bit quantization is handled by BitsAndBytes (bnb), which runs on top of PyTorch. (Transformers supports other quantization backends such as AutoGPTQ, but this conversion uses bnb; see the sketch after this list.)
  • Unsloth acts as a wrapper, simplifying and optimizing the process for better efficiency.
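
As a rough illustration of what bnb does with each weight matrix, here is a toy example using `bitsandbytes.functional.quantize_4bit`; this is not the conversion script, just a sketch of the underlying operation (requires a CUDA device):

```python
import torch
import bitsandbytes.functional as bnbF

# Stand-in for one fp16 weight matrix of the model.
w = torch.randn(4096, 4096, dtype=torch.float16, device="cuda")

# Quantize to 4-bit NF4: weights are packed into uint8 storage
# (two 4-bit values per byte) plus per-block scaling statistics.
w4, quant_state = bnbF.quantize_4bit(w, quant_type="nf4")

# At matmul time the 4-bit blocks are dequantized back to fp16,
# which is what bnb's Linear4bit layers do internally.
w_restored = bnbF.dequantize_4bit(w4, quant_state)

print(w4.dtype, w4.numel(), w.numel())  # torch.uint8, half the element count
```

This packing is also why U8 appears among the tensor types listed below: the 4-bit weights are stored as uint8 in the safetensors files.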

This allows for reduced memory usage and faster inference while keeping the model compact.
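
For example, the converted repo can be loaded straight through Transformers. A hedged sketch follows: the NF4/fp16 settings are typical bnb defaults rather than values read from this repo, and a prequantized checkpoint normally embeds its own quantization config anyway.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

repo = "huggingkot/L3-8B-Stheno-v3.2-bnb-4bit"

# Typical NF4 settings; a prequantized repo usually carries these already.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    repo,
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(repo)
```
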

Model size: 4.65B params (Safetensors)
Tensor types: F16 · F32 · U8