# YiXin-Distill-Qwen-72B-AWQ

## Model Overview
YiXin-Distill-Qwen-72B is a high-performance distilled model for mathematical and general reasoning, derived from Qwen2.5-72B using reinforcement learning. It is specifically optimized for mathematical reasoning and general knowledge tasks: advanced distillation techniques enhance its reasoning capabilities while maintaining computational efficiency. Built on the robust Qwen model foundation, it aims for state-of-the-art performance across benchmark evaluations.

Our benchmark evaluations demonstrate that YiXin-Distill-Qwen-72B delivers strong performance, improving over comparable distilled models on key mathematical and general reasoning tasks by an observed average of 5 to 11 percentage points.
## Run locally with vLLM

For instance, you can start an OpenAI-compatible service using vLLM:
```shell
vllm serve /models/YiXin-Distill-Qwen-72B-AWQ \
    --tensor-parallel-size 2 \
    --quantization awq \
    --served-model-name YiXin-AILab/YiXin-Distill-Qwen-72B-AWQ
```

When serving from a local path, vLLM registers the model under that path by default; `--served-model-name` makes it reachable under its Hub name, which is the name used in the API request below.
Then you can call the Chat Completions API:
```shell
curl http://localhost:8000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "YiXin-AILab/YiXin-Distill-Qwen-72B-AWQ",
        "messages": [
            {"role": "system", "content": "You are a helpful and harmless assistant. You are Qwen developed by Alibaba. You should think step-by-step."},
            {"role": "user", "content": "8+8=?"}
        ]
    }'
```
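The same request can also be made from Python. The sketch below mirrors the curl payload above using only the standard library; the `build_chat_request` helper and `ask` wrapper are illustrative names, not part of any official client.

```python
import json
import urllib.request

# Endpoint and model name follow the curl example above.
API_URL = "http://localhost:8000/v1/chat/completions"
MODEL = "YiXin-AILab/YiXin-Distill-Qwen-72B-AWQ"

SYSTEM_PROMPT = (
    "You are a helpful and harmless assistant. You are Qwen developed by "
    "Alibaba. You should think step-by-step."
)


def build_chat_request(question: str) -> dict:
    """Build the JSON body matching the curl example."""
    return {
        "model": MODEL,
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": question},
        ],
    }


def ask(question: str) -> str:
    """POST the request to the local vLLM server and return the reply text."""
    data = json.dumps(build_chat_request(question)).encode("utf-8")
    req = urllib.request.Request(
        API_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        result = json.load(resp)
    # Standard OpenAI-style response shape.
    return result["choices"][0]["message"]["content"]
```

With the server running, `ask("8+8=?")` returns the model's answer as a string.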
## Citation
If you use YiXin-Distill-Qwen-72B-AWQ in your research, please cite this work appropriately:
```bibtex
@article{yixin2025,
  title={YiXin-Distill-Qwen-72B-AWQ: A High-Performance Distilled Model for Mathematical and General Reasoning},
  author={Your Name},
  year={2025},
  journal={Preprint}
}
```
## Acknowledgments
We acknowledge the contributions of the open-source community and researchers who have developed and maintained the Qwen and DeepSeek models. Their work has significantly advanced the field of large language model distillation and reasoning capabilities.