CodeEditorBench: Evaluating Code Editing Capability of Large Language Models Paper • 2404.03543 • Published Apr 4, 2024 • 16
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence Paper • 2406.11931 • Published Jun 17, 2024 • 63
AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agents Paper • 2407.18901 • Published Jul 26, 2024 • 33
Diversity Empowers Intelligence: Integrating Expertise of Software Engineering Agents Paper • 2408.07060 • Published Aug 13, 2024 • 42
SWE-bench-java: A GitHub Issue Resolving Benchmark for Java Paper • 2408.14354 • Published Aug 26, 2024 • 41
FuzzCoder: Byte-level Fuzzing Test via Large Language Model Paper • 2409.01944 • Published Sep 3, 2024 • 45
HyperAgent: Generalist Software Engineering Agents to Solve Coding Tasks at Scale Paper • 2409.16299 • Published Sep 9, 2024 • 12
CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings Paper • 2501.01257 • Published Jan 2 • 49
HumanEval Pro and MBPP Pro: Evaluating Large Language Models on Self-invoking Code Generation Paper • 2412.21199 • Published Dec 30, 2024 • 14
Outcome-Refining Process Supervision for Code Generation Paper • 2412.15118 • Published Dec 19, 2024 • 19
CodeDPO: Aligning Code Models with Self Generated and Verified Source Code Paper • 2410.05605 • Published Oct 8, 2024 • 1
Enhancing LLM Agents for Code Generation with Possibility and Pass-rate Prioritized Experience Replay Paper • 2410.12236 • Published Oct 16, 2024 • 1
OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models Paper • 2411.04905 • Published Nov 7, 2024 • 115
SWE-Fixer: Training Open-Source LLMs for Effective and Efficient GitHub Issue Resolution Paper • 2501.05040 • Published Jan 9 • 15
ACECODER: Acing Coder RL via Automated Test-Case Synthesis Paper • 2502.01718 • Published 18 days ago • 28
Large Language Model Guided Self-Debugging Code Generation Paper • 2502.02928 • Published 17 days ago • 11
CodeSteer: Symbolic-Augmented Language Models via Code/Text Guidance Paper • 2502.04350 • Published 17 days ago • 11
CODESIM: Multi-Agent Code Generation and Problem Solving through Simulation-Driven Planning and Debugging Paper • 2502.05664 • Published 13 days ago • 22