what's the base model?

#10
by XinglinZhao - opened

Thanks for sharing! Is the base model Qwen3-32B, or is it a model further specialized for deep research?

Alibaba-NLP org

The base model is Qwen3-30B-A3B-Thinking-2507.

callanwu changed discussion status to closed
Alibaba-NLP org

We then ran continued pre-training (CPT), supervised fine-tuning (SFT), and reinforcement learning (RL) to specialize it for deep-research tasks.

We tested it on our own dataset of 1,000 prompts focused on action interpretation and sequencing. This is not a public benchmark.

Under the same parameter settings, the performance is Qwen3-Next-80B-A3B > Qwen3-30B-A3B > Tongyi-DeepResearch-30B-A3B.

vLLM offline inference without quantization: temperature=0.6, top_p=0.95, top_k=20, repetition_penalty=1.0, presence_penalty=0.0, seed=42
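For reference, the reported settings map directly onto vLLM's `SamplingParams`. Below is a minimal sketch; the parameters are taken from this thread, while the model ID and usage lines are illustrative and assume the repo name used on this page:

```python
# Sampling settings reported above, kept as a plain dict so the
# snippet is self-contained without vLLM installed.
SAMPLING_KWARGS = {
    "temperature": 0.6,
    "top_p": 0.95,
    "top_k": 20,
    "repetition_penalty": 1.0,  # 1.0 = no repetition penalty
    "presence_penalty": 0.0,
    "seed": 42,                 # fixed seed for reproducibility
}

# Hypothetical usage with vLLM offline inference (requires vLLM;
# model ID assumed from this discussion page, not verified here):
# from vllm import LLM, SamplingParams
# llm = LLM(model="Alibaba-NLP/Tongyi-DeepResearch-30B-A3B")
# outputs = llm.generate(["your prompt"], SamplingParams(**SAMPLING_KWARGS))
```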

Is this performance trend common across other tasks?

The base model is Qwen3-30B-A3B-Thinking-2507.

The paper https://arxiv.org/pdf/2509.13310 says otherwise:

"Starting from Qwen’s pre-trained foundation models (e.g., Qwen3-30B-A3B-Base), our enhanced training pipeline consists of"

So I just wanted to confirm: which one is it? Thanks for the help!
