---
license: apache-2.0
base_model: Qwen/Qwen3-32B
tags:
- mlx
- 3bit
- quantized
---

# Qwen3-32B 3bit MLX

This is a 3-bit quantized version of [Qwen/Qwen3-32B](https://huggingface.co/Qwen/Qwen3-32B), converted for Apple's MLX framework.

## Model Details

- **Quantization**: 3-bit
- **Framework**: MLX
- **Base Model**: Qwen/Qwen3-32B
- **Model Size**: ~12GB (3-bit quantized)

## Usage

```python
from mlx_lm import load, generate

# Download (if needed) and load the quantized weights and tokenizer.
model, tokenizer = load("mlx-community/Qwen3-32B-3bit")

prompt = "Hello, how are you?"
messages = [{"role": "user", "content": prompt}]

# Apply the model's chat template so the prompt matches the format
# the model expects for conversation turns.
formatted_prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

response = generate(model, tokenizer, prompt=formatted_prompt, max_tokens=100)
print(response)
```

## Requirements

- Apple Silicon Mac (M1/M2/M3)
- macOS 13.0+
- Python 3.8+
- MLX and mlx-lm packages

## Installation

```bash
pip install mlx mlx-lm
```
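
## Command-Line Usage

For a quick smoke test after installation, mlx-lm also ships a command-line entry point. A minimal sketch using the standard `mlx_lm.generate` flags (run `mlx_lm.generate --help` to confirm the options in your installed version):

```bash
# Generate a short completion directly from the shell.
# --model accepts a Hugging Face repo ID or a local path.
mlx_lm.generate --model mlx-community/Qwen3-32B-3bit \
  --prompt "Hello, how are you?" \
  --max-tokens 100
```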