Introduction to our ReasonFlux-Coders

We introduce ReasonFlux-Coders, trained with CURE, our algorithm for co-evolving an LLM's coding and unit test generation abilities.

ReasonFlux-Coder-7B and ReasonFlux-Coder-14B outperform similarly sized Qwen Coders, DeepSeek Coders, and Seed-Coders, and naturally integrate into common test-time scaling and agentic coding pipelines.
ReasonFlux-Coder-4B is our Long-CoT model, outperforming Qwen3-4B while achieving 64.8% inference efficiency in unit test generation. It can also serve as a reward model for training base models via reinforcement learning (see our paper).
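
One common test-time scaling pattern that a coder and unit tester fit into is best-of-N selection: sample several candidate solutions, sample several unit tests, and keep the candidate that passes the most tests. The following is a minimal sketch of that idea, assuming the candidates and tests have already been generated as plain strings. The helper names `passes` and `best_of_n` and the sample task are hypothetical illustrations, not part of the ReasonFlux-Coder release, and this is not necessarily the exact selection rule used in the paper.

```python
from typing import Dict, List


def passes(solution_src: str, test_src: str) -> bool:
    """Run one assert-style unit test against one candidate solution."""
    namespace: Dict[str, object] = {}
    try:
        # NOTE: exec runs arbitrary code; real pipelines should sandbox this.
        exec(solution_src, namespace)  # define the candidate's functions
        exec(test_src, namespace)      # raises AssertionError on failure
        return True
    except Exception:
        return False


def best_of_n(solutions: List[str], tests: List[str]) -> int:
    """Return the index of the candidate that passes the most tests."""
    scores = [sum(passes(s, t) for t in tests) for s in solutions]
    # Ties resolve to the earliest candidate.
    return max(range(len(solutions)), key=scores.__getitem__)


# Hypothetical model outputs for the task "return the absolute value of x".
candidates = [
    "def solve(x):\n    return x",                    # wrong for negatives
    "def solve(x):\n    return -x if x < 0 else x",   # correct
]
generated_tests = [
    "assert solve(3) == 3",
    "assert solve(-2) == 2",
    "assert solve(0) == 0",
]

best = best_of_n(candidates, generated_tests)
print(f"Selected candidate index: {best}")
```

In practice the candidate and test strings would come from sampling the model, and the same pass counts could be reused as a reward signal. Both of those steps are left out here to keep the sketch self-contained.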

Paper | Code

Citation

@article{wang2025cure,
  title={Co-Evolving LLM Coder and Unit Tester via Reinforcement Learning},
  author={Wang, Yinjie and Yang, Ling and Tian, Ye and Shen, Ke and Wang, Mengdi},
  journal={arXiv preprint arXiv:2506.03136},
  year={2025}
}

Model size: 8B params (Safetensors, tensor type BF16)

Paper for Gen-Verse/ReasonFlux-Coder-7B: Co-Evolving LLM Coder and Unit Tester via Reinforcement Learning (arXiv:2506.03136, published Jun 3, 2025)