This repo contains serialized blobs of the up-projection layer of llama3-8B (oc=14336, ic=4096). The linear layer has been quantized with GPTQ (W4, symmetric, group size 32) and sparsified to 50%.
```
├── sparse_w4
│   ├── linear_bitmap_int32.bin
│   ├── linear_compressed_qweight_int32.bin
│   ├── linear_nnz_int16.bin
│   ├── linear_scales_float16.bin
│   └── linear_zeros_int32.bin
```
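As a rough illustration of how the int32-packed 4-bit values in `linear_compressed_qweight_int32.bin` might be decoded: each int32 word holds eight 4-bit values. This is a sketch only; the nibble order (least-significant first here) is an assumption, and `unpack_blobs.py` is authoritative.

```python
import numpy as np

def unpack_int4(packed: np.ndarray) -> np.ndarray:
    """Unpack int32 words into eight 4-bit values each.

    Assumes least-significant-nibble-first ordering (an assumption,
    not confirmed by the repo).
    """
    packed = packed.astype(np.uint32)
    shifts = np.arange(0, 32, 4, dtype=np.uint32)   # 8 nibbles per 32-bit word
    nibbles = (packed[..., None] >> shifts) & 0xF   # (..., 8)
    return nibbles.reshape(*packed.shape[:-1], -1).astype(np.int32)

# Round-trip check on synthetic data: pack 16 nibbles into 2 words, unpack them.
vals = np.arange(16, dtype=np.uint32)
words = np.zeros(2, dtype=np.uint32)
for i, v in enumerate(vals):
    words[i // 8] |= v << np.uint32(4 * (i % 8))
assert np.array_equal(unpack_int4(words.reshape(1, 2)).ravel(), vals)
```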
Usage
The following script shows how to process the blobs in Python. It demonstrates unpacking, zero-location recovery, and weight dequantization:

```
python unpack_blobs.py
```
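The zero-location recovery and dequantization steps can be sketched for a single row as follows. This is a hypothetical illustration, not the repo's actual code: it assumes the bitmap marks nonzero weight positions, that the compressed qweight blob stores only the surviving 4-bit values, and the usual GPTQ rule w = (q - z) * s applied per group of 32 columns; consult `unpack_blobs.py` for the real layout.

```python
import numpy as np

GROUP = 32  # quantization group size stated in the model card

def dequant_sparse_row(bitmap_bits, q_nnz, scales, zeros):
    """Scatter nonzero 4-bit quants back to dense positions, then dequantize.

    bitmap_bits : (ic,) 0/1 mask of nonzero weight positions (assumed semantics)
    q_nnz       : 4-bit quantized values for the nonzero positions only
    scales      : (ic // GROUP,) per-group scales
    zeros       : (ic // GROUP,) per-group integer zero points
    """
    ic = bitmap_bits.shape[0]
    g = np.arange(ic) // GROUP           # group index of each column
    q = np.empty(ic, dtype=np.int32)
    q[:] = zeros[g]                      # pruned positions -> zero point -> dequantize to 0.0
    q[bitmap_bits.astype(bool)] = q_nnz  # zero-location recovery
    return (q - zeros[g]).astype(np.float32) * scales[g].astype(np.float32)

# Tiny synthetic example: ic = 64 columns, ~50% sparsity.
rng = np.random.default_rng(0)
bitmap = (rng.random(64) < 0.5).astype(np.int32)
q_nnz = rng.integers(0, 16, bitmap.sum(), dtype=np.int32)
scales = np.full(64 // GROUP, 0.01, dtype=np.float16)
zeros = np.full(64 // GROUP, 8, dtype=np.int32)   # symmetric W4: zero point 8 (assumption)
w = dequant_sparse_row(bitmap, q_nnz, scales, zeros)
assert np.all(w[bitmap == 0] == 0.0)    # pruned weights come back as exact zeros
```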
You can ignore the `internal/` directory.