SwS: Self-aware Weakness-driven Problem Synthesis in Reinforcement Learning for LLM Reasoning • Paper • 2506.08989 • Published 14 days ago
Audio Jailbreak: An Open Comprehensive Benchmark for Jailbreaking Large Audio-Language Models • Paper • 2505.15406 • Published May 21
Evaluate Bias without Manual Test Sets: A Concept Representation Perspective for LLMs • Paper • 2505.15524 • Published May 21
Word Form Matters: LLMs' Semantic Reconstruction under Typoglycemia • Paper • 2503.01714 • Published Mar 3
Geolocation with Real Human Gameplay Data: A Large-Scale Dataset and Human-Like Reasoning Framework • Paper • 2502.13759 • Published Feb 19
Injecting Domain-Specific Knowledge into Large Language Models: A Comprehensive Survey • Paper • 2502.10708 • Published Feb 15