OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement
Abstract
The introduction of large language models has significantly advanced code generation. However, open-source models often lack the execution capabilities and iterative refinement of advanced systems like the GPT-4 Code Interpreter. To address this, we introduce OpenCodeInterpreter, a family of open-source code systems designed for generating, executing, and iteratively refining code. Supported by Code-Feedback, a dataset featuring 68K multi-turn interactions, OpenCodeInterpreter integrates execution and human feedback for dynamic code refinement. Our comprehensive evaluation of OpenCodeInterpreter across key benchmarks such as HumanEval, MBPP, and their enhanced versions from EvalPlus reveals its exceptional performance. Notably, OpenCodeInterpreter-33B achieves an average accuracy of 83.2 (76.4 on the plus versions) across HumanEval and MBPP, closely rivaling GPT-4's 84.2 (76.2), and rises further to 91.6 (84.6) with synthesized human feedback from GPT-4. OpenCodeInterpreter bridges the gap between open-source code generation models and proprietary systems like GPT-4 Code Interpreter.
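To make the execute-and-refine idea concrete, here is a minimal sketch of the kind of loop the abstract describes: the model drafts code, the code is executed, and any error output is fed back for another round. The `generate` callable is a hypothetical stand-in for a call to OpenCodeInterpreter (or any chat-style code model), and the subprocess executor and feedback wording are illustrative assumptions, not the authors' implementation.

```python
# Sketch of a generate-execute-refine loop in the spirit of OpenCodeInterpreter.
# `generate` is a hypothetical stand-in for a model call; the error-feedback
# format below is illustrative, not the paper's exact protocol.
import subprocess
import sys
import tempfile


def execute(code: str, timeout: int = 10) -> tuple[bool, str]:
    """Run candidate code in a subprocess and capture stdout/stderr."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    try:
        proc = subprocess.run(
            [sys.executable, path], capture_output=True, text=True, timeout=timeout
        )
        ok = proc.returncode == 0
        return ok, proc.stdout if ok else proc.stderr
    except subprocess.TimeoutExpired:
        return False, "Execution timed out."


def refine_loop(prompt: str, generate, max_turns: int = 3) -> str:
    """Iteratively ask the model for code and feed execution results back."""
    history = [{"role": "user", "content": prompt}]
    code = generate(history)                      # first draft from the model
    for _ in range(max_turns):
        ok, output = execute(code)
        if ok:
            return code                           # runs cleanly: stop refining
        # Turn the execution error into a feedback message for the next turn.
        history.append({"role": "assistant", "content": code})
        history.append(
            {"role": "user",
             "content": f"The code failed with:\n{output}\nPlease fix it."}
        )
        code = generate(history)
    return code
```

In the paper's setting, the feedback messages can also come from humans (or GPT-4-synthesized human feedback) rather than only from the executor; the loop structure stays the same.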
Community
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- DeepSeek-Coder: When the Large Language Model Meets Programming - The Rise of Code Intelligence (2024)
- DolphCoder: Echo-Locating Code Large Language Models with Diverse and Multi-Objective Instruction Tuning (2024)
- MoTCoder: Elevating Large Language Models with Modular of Thought for Challenging Programming Tasks (2023)
- DebugBench: Evaluating Debugging Capability of Large Language Models (2024)
- EffiBench: Benchmarking the Efficiency of Automatically Generated Code (2024)