This model was created with the following code:
import json
import os

from datasets import load_dataset
from gptqmodel import GPTQModel, QuantizeConfig
from huggingface_hub import constants
model_id = "Qwen/Qwen3-32B"
# Save the quantized model in the HF cache directory
cache_dir = constants.HF_HUB_CACHE
quant_path = os.path.join(cache_dir, "models--quantized--" + model_id.replace("/", "--") + "--custom--calibration")
os.makedirs(quant_path, exist_ok=True)
# Load calibration data
calibration_dataset = []
with open("./data/custom_calibration_dataset.jsonl", "r") as f:
    for line in f:
        if line.strip():  # Skip empty lines
            item = json.loads(line)
            calibration_dataset.append(item["text"])
# Configure and run quantization
quant_config = QuantizeConfig(bits=4, group_size=128)
model = GPTQModel.load(model_id, quant_config)
model.quantize(calibration_dataset, batch_size=2)
model.save(quant_path)
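
The calibration-loading step above can also be written as a small reusable helper. This is a minimal sketch, not part of the original script: the function name `load_calibration_texts` and its `text_key` parameter are hypothetical, and it assumes each non-empty line of the JSONL file is a JSON object with a `"text"` field, as the loop above implies.

```python
import json
from pathlib import Path


def load_calibration_texts(path, text_key="text"):
    """Read a JSONL file and return its calibration strings as a list.

    Hypothetical helper: mirrors the loading loop in the script above,
    but reports the offending line when the expected field is missing.
    """
    texts = []
    with Path(path).open("r", encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.strip()
            if not line:  # skip empty lines, as the original loop does
                continue
            item = json.loads(line)
            if text_key not in item:
                raise KeyError(f"line {line_no}: missing {text_key!r} field")
            texts.append(item[text_key])
    return texts
```

With this helper, the loading loop could be replaced by `calibration_dataset = load_calibration_texts("./data/custom_calibration_dataset.jsonl")`, and the resulting list passed to `model.quantize` unchanged.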
Model tree for coco101010/Qwen3-32B-GPTQ-4bit-custom-calibration
Base model: Qwen/Qwen3-32B