Panacea-MegaScience-Qwen3-1.7B-q4-hi-mlx
Top Quantizations
✅ Recommended High-Performers
🥇 q5 Quantization
Why: Highest winogrande score (0.694 vs. the 0.574 average) and near-best ARC-Easy (~0.398).
Strength: Best balance of accuracy and robustness across tasks (especially winogrande & ARC-Easy).
Ideal for: Production deployments needing top end-to-end accuracy.
🥈 q6-hi Quantization
Why: Best ARC-Easy (0.398) plus a winogrande score (0.696) effectively tied with q5.
Strength: Strong precision on ARC tasks while matching the best boolq score (0.622).
Ideal for: ARC-focused QA tasks or mixed-training pipelines.
🥉 q4-hi Quantization
Why: Tied for best boolq (0.622, with q5 and q6-hi) and competitive hellaswag.
Strength: Lightweight quantization suited to fast inference on boolq-style, data-centric tasks.
Performance Insights
Winogrande Champion: q5 (0.694) → optimal for complex reasoning tasks.
Consistency King: q5 and q6-hi (both reach >90% of the maximum winogrande score; see the sketch after this list).
Surprise: bf16 trails noticeably on winogrande (0.550) despite strong ARC-Easy → useful as a baseline.
Cost-Saver: q4-hi ties for the best boolq score with minimal setup overhead.
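A minimal sketch of the consistency check above, using only the winogrande scores quoted in this card (quants whose scores are not listed here, such as q4-hi, are omitted):

# Winogrande scores as reported elsewhere in this card.
winogrande = {"q5": 0.694, "q6-hi": 0.696, "bf16": 0.550}

best = max(winogrande.values())
for quant, score in winogrande.items():
    ratio = score / best
    status = "consistent" if ratio > 0.90 else "below threshold"
    print(f"{quant}: {score:.3f} ({ratio:.0%} of max, {status})")

Running this confirms that q5 and q6-hi sit within a fraction of a percent of each other, while bf16 lands at roughly 79% of the maximum.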
Recommendation Summary
| Use Case | Top Quant | Key Advantage |
|---|---|---|
| Highest winogrande accuracy | q5 | +26% vs. bf16 (0.550 → 0.694) |
| ARC-Easy focus | q6-hi | Highest ARC-Easy (0.398) |
| BoolQ-centric workflows | q4-hi | Best boolq (0.622) |
| Balanced end-to-end | q5 | Best holistic median score |
💡 Pro Tip: If latency is critical, deploy q5 for accuracy with q4-hi as a fallback; the trade-off in boolq is minimal and q5 wins on winogrande. A deployment sketch follows below.
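A hedged sketch of that tip, assuming the q5 variant lives in a sibling repo (only the q4-hi repo id is confirmed in this card; the q5 id below is a guess):

from mlx_lm import load

PRIMARY = "nightmedia/Panacea-MegaScience-Qwen3-1.7B-q5-mlx"      # assumed repo id
FALLBACK = "nightmedia/Panacea-MegaScience-Qwen3-1.7B-q4-hi-mlx"  # this repo

try:
    # Prefer q5: best winogrande (0.694) per the table above.
    model, tokenizer = load(PRIMARY)
except Exception:
    # Fall back to q4-hi: tied-best boolq (0.622) with lighter weights.
    model, tokenizer = load(FALLBACK)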
Visual Summary
Winogrande (↑) → q5 🥇 = 0.694
ARC-Easy (↑) → q6-hi 🥈 = 0.398
BoolQ (↑) → q4-hi 🥉 = 0.622
Consistency → q5/q6-hi (★★★★☆)
This model, Panacea-MegaScience-Qwen3-1.7B-q4-hi-mlx, was converted to MLX format from prithivMLmods/Panacea-MegaScience-Qwen3-1.7B using mlx-lm version 0.26.3.
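For reference, a conversion along these lines can be sketched with mlx-lm's Python API. q_bits=4 matches the "q4" label; the exact settings behind the "-hi" suffix are not documented here, so treat this as an approximation:

from mlx_lm import convert

convert(
    "prithivMLmods/Panacea-MegaScience-Qwen3-1.7B",
    mlx_path="Panacea-MegaScience-Qwen3-1.7B-q4-hi-mlx",
    quantize=True,
    q_bits=4,  # 4-bit weights; the "-hi" group-size setting is an assumption left at its default
)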
Use with mlx
pip install mlx-lm
from mlx_lm import load, generate

model, tokenizer = load("nightmedia/Panacea-MegaScience-Qwen3-1.7B-q4-hi-mlx")

prompt = "hello"

# Apply the chat template when the tokenizer defines one.
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True
    )

response = generate(model, tokenizer, prompt=prompt, verbose=True)
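As a follow-up to the snippet above, the response length can be capped with generate's max_tokens argument (the value 256 here is arbitrary):

# Continues from the example above; limit the completion to 256 tokens.
response = generate(model, tokenizer, prompt=prompt, max_tokens=256, verbose=True)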
Model tree for nightmedia/Panacea-MegaScience-Qwen3-1.7B-q4-hi-mlx: Qwen/Qwen3-1.7B-Base (base model) → Qwen/Qwen3-1.7B (finetuned) → prithivMLmods/Panacea-MegaScience-Qwen3-1.7B → this model.