
DistilQwen-ThoughtX: Optimized Reasoning Models with OmniThought

DistilQwen-ThoughtX is a series of high-performance reasoning models trained on the OmniThought dataset. These models are optimized for chain-of-thought (CoT) reasoning with balanced verbosity and cognitive difficulty, achieving state-of-the-art results on mathematical, coding, and logical reasoning benchmarks.


Model Variants

| Model Name | Parameters | Base Model | Hugging Face Link |
|---|---|---|---|
| DistilQwen-ThoughtX-7B | 7B | Qwen2.5-7B-Instruct | Link |
| DistilQwen-ThoughtX-32B | 32B | Qwen2.5-32B-Instruct | Link |

Key Features

  1. Optimal Reasoning Verbosity (RV):
    CoT processes are filtered to avoid overthinking (excessive steps) or under-reasoning, improving efficiency and accuracy.

  2. Cognitive Difficulty (CD) Alignment:
    CoTs are selected to match the model's capacity, ensuring smaller models learn simpler reasoning paths while larger models handle more complex logic (see the sketch after this list).

  3. Performance:
    Outperforms existing open-source reasoning models (e.g., DeepSeek-R1-Distill, OpenThinker) on benchmarks like AIME2024, MATH500, and LiveCodeBench V2.
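
A minimal sketch of what RV/CD-aware selection can look like is shown below. The per-sample "rv" and "cd" fields, score ranges, and thresholds are illustrative assumptions, not the exact curation pipeline used to build DistilQwen-ThoughtX.

# Hypothetical sketch: keep CoTs whose cognitive difficulty fits a student model
# and whose verbosity falls in a target band. Field names and thresholds are
# assumptions for illustration only.
def select_cots(samples, cd_capacity, rv_range=(2, 7)):
    rv_lo, rv_hi = rv_range
    return [
        s for s in samples
        if s["cd"] <= cd_capacity and rv_lo <= s["rv"] <= rv_hi
    ]

# Example: a smaller student model receives only the easier, tighter CoTs.
corpus = [
    {"cot": "short derivation ...", "rv": 3, "cd": 2},
    {"cot": "very long exploratory trace ...", "rv": 9, "cd": 8},
]
print(len(select_cots(corpus, cd_capacity=5)))  # -> 1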


Usage

Inference Example

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "alibaba-pai/DistilQwen-ThoughtX-7B"  # or "alibaba-pai/DistilQwen-ThoughtX-32B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

# The model is instruction-tuned, so format the prompt with the chat template.
prompt = "Solve ∫x e^x dx. Show your reasoning step-by-step."
messages = [{"role": "user", "content": prompt}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Reasoning traces can be long; allow a generous generation budget.
outputs = model.generate(inputs, max_new_tokens=2048)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Training Data: OmniThought

The models are trained on the OmniThought dataset (a loading sketch follows the list below), which includes:

  • 2 million CoT processes with RV and CD annotations.
  • Coverage of mathematics, coding, and logical reasoning tasks.
  • Validated by multiple teacher models (DeepSeek-R1, QwQ-32B).
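
If you want to inspect the data yourself, a minimal loading sketch with the Hugging Face datasets library is below. The Hub path "alibaba-pai/OmniThought" and the annotation field layout are assumptions; check the OmniThought dataset card for the exact identifier and schema.

from datasets import load_dataset

# Assumed Hub path; verify against the OmniThought dataset card.
ds = load_dataset("alibaba-pai/OmniThought", split="train")

# Inspect the actual schema; per-sample RV and CD annotations are expected.
print(ds[0].keys())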

Benchmarks

| Model | AIME2024 | MATH500 | GPQA-D | LiveCodeBench V2 |
|---|---|---|---|---|
| DeepSeek-R1-Distill-7B | 57.3 | 89.6 | 47.3 | 48.4 |
| DistilQwen-ThoughtX-7B | 56.7 | 90.2 | 50.0 | 56.8 |
| DeepSeek-R1-Distill-32B | 74.7 | 90.0 | 62.4 | 72.3 |
| DistilQwen-ThoughtX-32B | 80.0 | 92.6 | 64.0 | 73.4 |

Reference

For more detailed information about the model, we encourage you to refer to our paper:

  • Reasoning with OmniThought: A Large CoT Dataset with Verbosity and Cognitive Difficulty Annotations
    Wenrui Cai, Chengyu Wang, Junbing Yan, Jun Huang, Xiangzhong Fang. arXiv:2505.10937

You can cite the paper using the following citation format:

@misc{cai2025reasoningomnithoughtlargecot,
      title={Reasoning with OmniThought: A Large CoT Dataset with Verbosity and Cognitive Difficulty Annotations}, 
      author={Wenrui Cai and Chengyu Wang and Junbing Yan and Jun Huang and Xiangzhong Fang},
      year={2025},
      eprint={2505.10937},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2505.10937} 
}