| Model | AIME2024 | MATH500 | GPQA Diamond | LiveCodeBench V2 | Avg. |
|---|---|---|---|---|---|
| DistillQwen-ThoughtY-32B | 90.0 | 95.2 | 63.6 | 76.3 | 81.3 |
| Qwen3-32B (thinking) | 76.7 | 94.8 | 65.7 | 72.2 | 77.3 |
| DistillQwen-ThoughtX-32B | 80.0 | 92.6 | 64.0 | 73.4 | 77.5 |
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "alibaba-pai/DistillQwen-ThoughtY-4B"

# Load the model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Build a chat-formatted prompt
prompt = "Solve ∫x e^x dx. Show your reasoning step-by-step."
messages = [
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True  # Switches between thinking and non-thinking modes. Default is True.
)

# Generate and decode the response
inputs = tokenizer([text], return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32768)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
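With `enable_thinking=True`, the model emits its reasoning trace before the final answer. If you want to separate the two, the snippet below is a minimal sketch that assumes the Qwen3-style chat template, where the reasoning block is closed by a dedicated `</think>` token; the `think_end_id` lookup and variable names are illustrative, not part of the model's official API.

```python
# A minimal sketch for splitting the reasoning trace from the final answer.
# Assumes a Qwen3-style template that closes reasoning with a </think> token.
output_ids = outputs[0][inputs.input_ids.shape[1]:].tolist()  # drop the prompt tokens
think_end_id = tokenizer.convert_tokens_to_ids("</think>")

try:
    # Position just after the last </think> token
    index = len(output_ids) - output_ids[::-1].index(think_end_id)
except ValueError:
    index = 0  # no reasoning block found

thinking_content = tokenizer.decode(output_ids[:index], skip_special_tokens=True).strip("\n")
answer_content = tokenizer.decode(output_ids[index:], skip_special_tokens=True).strip("\n")

print("Reasoning:", thinking_content)
print("Answer:", answer_content)
```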
For more detailed information about the model, please refer to our paper, [Reasoning with OmniThought: A Large CoT Dataset with Verbosity and Cognitive Difficulty Annotations](https://arxiv.org/abs/2505.10937). You can cite it using the following BibTeX entry:
```bibtex
@misc{cai2025reasoningomnithoughtlargecot,
  title={Reasoning with OmniThought: A Large CoT Dataset with Verbosity and Cognitive Difficulty Annotations},
  author={Wenrui Cai and Chengyu Wang and Junbing Yan and Jun Huang and Xiangzhong Fang},
  year={2025},
  eprint={2505.10937},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2505.10937}
}
```