README.md · stepfun-ai/Qwen2.5-32B-DialogueReason at 1fefda4ceb6e20c85bba35883b9c56f09edff9b5

license: apache-2.0

Introduction

Qwen2.5-32B-DialogueReason is a dialogue-based reasoning model built on Qwen2.5-32B-Base.
The model is trained with rule-based reinforcement learning (RL) on Open-Reasoner-Zero data.

🧠 Key Features

Qwen2.5-32B-Base as the foundation model.
Rule-based RL training to achieve dialogue reasoning.
Dynamic agent initialization that adapts to various scenarios.
Flexible environment configuration for setting up task-specific contexts.
Multi-turn dialogue reasoning that solves problems incrementally.
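
To make "rule-based RL" concrete, here is a minimal sketch of what a rule-based reward can look like: the response is scored by a fixed rule rather than by a learned reward model. This README does not describe the actual reward used in training, so everything below is an assumption for illustration. In particular, the function names `extract_final_answer` and `rule_based_reward` are hypothetical, and so is the convention that the final answer appears inside `\boxed{...}`.

```python
import re
from typing import Optional

# Assumption: the final answer is wrapped in \boxed{...}. This convention is
# chosen for illustration only and is not taken from this model card.
_BOXED = re.compile(r"\\boxed\{([^{}]*)\}")


def extract_final_answer(response: str) -> Optional[str]:
    """Return the content of the last \\boxed{...} in the response, if any."""
    matches = _BOXED.findall(response)
    return matches[-1].strip() if matches else None


def rule_based_reward(response: str, reference: str) -> float:
    """Hypothetical binary reward: 1.0 on an exact answer match, else 0.0."""
    answer = extract_final_answer(response)
    if answer is None:
        # No parsable final answer, so the rule gives no reward.
        return 0.0
    return 1.0 if answer == reference.strip() else 0.0
```

A reward of this kind only inspects the final answer, so it works the same way whether the reasoning is written as a monologue or as a multi-turn dialogue between agents.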