What's the base model?
Thanks for sharing! Is the base model Qwen3 32B, or is it a model more specialized for deep research?
We then ran CPT, SFT, and RL to specialize it for deep-research tasks.
We tested it on our own dataset of 1,000 prompts focused on action interpretation and sequencing; this is not a public benchmark.
Under the same parameter settings, the performance ranking is Qwen3-Next-80B-A3B > Qwen3-30B-A3B > Tongyi-DeepResearch-30B-A3B.
vLLM offline inference without quantization: temperature=0.6, top_p=0.95, top_k=20, repetition_penalty=1.0, presence_penalty=0.0, seed=42
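For reference, those settings can be expressed as the keyword arguments one would pass to vLLM's `SamplingParams`. This is only a sketch; the model name and prompt in the commented usage are placeholders, not part of the thread above.

```python
# Sampling settings from the comparison above, as kwargs for vLLM's
# SamplingParams (offline inference, no quantization).
sampling_kwargs = dict(
    temperature=0.6,
    top_p=0.95,
    top_k=20,
    repetition_penalty=1.0,  # 1.0 = no repetition penalty
    presence_penalty=0.0,
    seed=42,                 # fixed seed for reproducibility
)

# Usage sketch (requires vLLM and the model weights; model name is a placeholder):
# from vllm import LLM, SamplingParams
# llm = LLM(model="Qwen/Qwen3-30B-A3B-Thinking-2507")
# outputs = llm.generate([prompt], SamplingParams(**sampling_kwargs))
```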
Is this performance trend common across other tasks?
The base model is Qwen3-30B-A3B-Thinking-2507.
The paper https://arxiv.org/pdf/2509.13310 says otherwise:
"Starting from Qwen’s pre-trained foundation models (e.g., Qwen3-30B-A3B-Base), our enhanced training pipeline consists of"
So I just wanted to confirm: which one is it? Thanks for the help!