MetaMind: Modeling Human Social Thoughts with Metacognitive Multi-Agent Systems Paper • 2505.18943 • Published 17 days ago • 24 • 4
Agent-SafetyBench: Evaluating the Safety of LLM Agents Paper • 2412.14470 • Published Dec 19, 2024 • 13
Seeker: Towards Exception Safety Code Generation with Intermediate Language Agents Framework Paper • 2412.11713 • Published Dec 16, 2024 • 6
EvoCodeBench: An Evolving Code Generation Benchmark with Domain-Specific Evaluations Paper • 2410.22821 • Published Oct 30, 2024 • 2
Seeker: Enhancing Exception Handling in Code with LLM-based Multi-Agent Approach Paper • 2410.06949 • Published Oct 9, 2024 • 6
DevEval: A Manually-Annotated Code Generation Benchmark Aligned with Real-World Code Repositories Paper • 2405.19856 • Published May 30, 2024 • 9
DevEval: Evaluating Code Generation in Practical Software Projects Paper • 2401.06401 • Published Jan 12, 2024
EvoCodeBench: An Evolving Code Generation Benchmark Aligned with Real-World Code Repositories Paper • 2404.00599 • Published Mar 31, 2024 • 1