zankich
/

Qwen3-32B-INT8

Text Generation

text-generation-inference

text2text-generation

8-bit precision

compressed-tensors

Model card Files Files and versions Community

GPTQ INT8 W8A8 quantized Qwen/Qwen3-32B

GPTQ INT8 W8A8 quantized Qwen/Qwen3-32B calibrated with a sequence len of 4096 and 128 samples of TokenBender/code_instructions_122k_alpaca_style, glaiveai/glaive-code-assistant-v2, google/code_x_glue_ct_code_to_text for a total sample size of 1024.

Follow the Qwen/Qwen3-32B docs for running with vllm.

Downloads last month: 67

Safetensors

Model size

32.8B params

Tensor type

BF16

·

I8

·

Inference Providers NEW

Text Generation

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for zankich/Qwen3-32B-INT8

Base model

Qwen/Qwen3-32B

Quantized

(98)

this model

Datasets used to train zankich/Qwen3-32B-INT8