IterPref: Focal Preference Learning for Code Generation via Iterative Debugging
Abstract
Preference learning enhances Code LLMs beyond supervised fine-tuning by leveraging relative quality comparisons. Existing methods construct preference pairs from candidate programs based on test-case success, treating the sample with the higher pass rate as positive and the one with the lower pass rate as negative. However, this approach does not pinpoint specific errors in the code: aligning a failing program as a whole lacks the granularity needed to capture meaningful error-resolution relationships, so the model cannot learn informative error-correction patterns. To address these issues, we propose IterPref, a new preference alignment framework that mimics human iterative debugging to refine Code LLMs. IterPref explicitly locates error regions and aligns the corresponding tokens via a tailored DPO algorithm. To generate informative pairs, we introduce the CodeFlow dataset, in which samples are iteratively refined until they pass the tests, with the modifications capturing the error corrections. Extensive experiments show that a diverse suite of Code LLMs equipped with IterPref achieves significant gains in code generation and improves on challenging tasks such as BigCodeBench. In-depth analysis shows that IterPref yields fewer errors. Our code and data will be made publicly available.
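The abstract describes a DPO variant that restricts preference alignment to the tokens in the identified error region. The paper's exact objective is not reproduced on this page; the sketch below is only a rough illustration of what such token-focused preference alignment could look like. The function name `focal_dpo_loss`, the `error_mask` input, and the masking scheme are assumptions made for illustration, not IterPref's released implementation.

```python
# Illustrative sketch (not the paper's code): a DPO-style loss in which the rejected
# (failing) sample contributes only through tokens inside a hypothetical error-region
# mask, so the preference signal focuses on the faulty span rather than the whole program.
import torch
import torch.nn.functional as F


def focal_dpo_loss(policy_chosen_logps,    # (B, T) per-token log-probs of the chosen (passing) code
                   policy_rejected_logps,  # (B, T) per-token log-probs of the rejected (failing) code
                   ref_chosen_logps,       # (B, T) same quantities under the frozen reference model
                   ref_rejected_logps,
                   chosen_mask,            # (B, T) 1 for valid code tokens in the chosen sample
                   error_mask,             # (B, T) 1 only for tokens in the identified error region
                   beta=0.1):
    # Sum log-probs over all chosen tokens, but only over error-region tokens of the rejected sample.
    chosen_logp = (policy_chosen_logps * chosen_mask).sum(-1)
    rejected_logp = (policy_rejected_logps * error_mask).sum(-1)
    ref_chosen_logp = (ref_chosen_logps * chosen_mask).sum(-1)
    ref_rejected_logp = (ref_rejected_logps * error_mask).sum(-1)

    # Standard DPO preference margin, computed on the masked log-probability sums.
    logits = (chosen_logp - ref_chosen_logp) - (rejected_logp - ref_rejected_logp)
    return -F.logsigmoid(beta * logits).mean()
```

In this sketch the error mask would come from the diff between consecutive refinement steps (as the CodeFlow construction suggests), but how IterPref actually weights or selects those tokens is defined in the paper itself.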
Community
The code and data will be released soon!
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API:
- RefineCoder: Iterative Improving of Large Language Models via Adaptive Critique Refinement for Code Generation (2025)
- Focused-DPO: Enhancing Code Generation Through Focused Preference Optimization on Error-Prone Points (2025)
- ACECODER: Acing Coder RL via Automated Test-Case Synthesis (2025)
- Learning to Solve and Verify: A Self-Play Framework for Code and Test Generation (2025)
- Process-Supervised Reinforcement Learning for Code Generation (2025)
- Learning to Generate Unit Tests for Automated Debugging (2025)
- CODESYNC: Synchronizing Large Language Models with Dynamic Code Evolution at Scale (2025)