Training Software Engineering Agents and Verifiers with SWE-Gym • Paper • 2412.21139 • Published Dec 30, 2024 • 24
Advancing LLM Reasoning Generalists with Preference Trees • Paper • 2404.02078 • Published Apr 2, 2024 • 47
SaySelf: Teaching LLMs to Express Confidence with Self-Reflective Rationales • Paper • 2405.20974 • Published May 31, 2024
A Single Transformer for Scalable Vision-Language Modeling • Paper • 2407.06438 • Published Jul 8, 2024 • 1
OpenDevin: An Open Platform for AI Software Developers as Generalist Agents • Paper • 2407.16741 • Published Jul 23, 2024 • 73
MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback • Paper • 2309.10691 • Published Sep 19, 2023 • 4
CRAFT: Customizing LLMs by Creating and Retrieving from Specialized Toolsets • Paper • 2309.17428 • Published Sep 29, 2023 • 1
R-Tuning: Teaching Large Language Models to Refuse Unknown Questions • Paper • 2311.09677 • Published Nov 16, 2023 • 3
If LLM Is the Wizard, Then Code Is the Wand: A Survey on How Code Empowers Large Language Models to Serve as Intelligent Agents • Paper • 2401.00812 • Published Jan 1, 2024 • 10