GPTQ INT8 W8A8 quantized Qwen/Qwen3-32B

Calibrated with a sequence length of 4096 and 128 samples of TokenBender/code_instructions_122k_alpaca_style, glaiveai/glaive-code-assistant-v2, and google/code_x_glue_ct_code_to_text, for a total sample size of 1024.
Follow the Qwen/Qwen3-32B docs for running with vLLM.
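As a minimal sketch, serving this checkpoint should look the same as serving the base model, since vLLM reads the quantization scheme from the model config; the repo id and the `--max-model-len` value (matching the 4096 calibration sequence length) are assumptions here, not tested settings:

```shell
# Serve the INT8 W8A8 checkpoint with vLLM.
# vLLM detects the quantization format from the model's config files,
# so no explicit quantization flag should be needed.
vllm serve zankich/Qwen3-32B-INT8 \
  --max-model-len 4096
```

Once the server is up, it exposes the usual OpenAI-compatible API on port 8000, so any client that works with the base Qwen/Qwen3-32B should work unchanged.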
Model tree for zankich/Qwen3-32B-INT8
- Base model: Qwen/Qwen3-32B