GSO: Challenging Software Optimization Tasks for Evaluating SWE-Agents • Paper 2505.23671 • Published May 29, 2025
RAFT: Adapting Language Model to Domain Specific RAG • Paper 2403.10131 • Published Mar 15, 2024
LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code • Paper 2403.07974 • Published Mar 12, 2024
What's in a Name? Are BERT Named Entity Representations just as Good for any other Name? • Paper 2007.06897 • Published Jul 14, 2020
LLM-Assisted Code Cleaning For Training Accurate Code Generators • Paper 2311.14904 • Published Nov 25, 2023