LLM-Microscope: Uncovering the Hidden Role of Punctuation in Context Memory of Transformers Paper • 2502.15007 • Published 20 days ago • 162
MLGym: A New Framework and Benchmark for Advancing AI Research Agents Paper • 2502.14499 • Published 21 days ago • 178
ReLearn: Unlearning via Learning for Large Language Models Paper • 2502.11190 • Published 25 days ago • 29
SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software Engineering? Paper • 2502.12115 • Published 23 days ago • 43
InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU Paper • 2502.08910 • Published 28 days ago • 143
Auto-RT: Automatic Jailbreak Strategy Exploration for Red-Teaming Large Language Models Paper • 2501.01830 • Published Jan 3 • 18
Multiagent Finetuning: Self Improvement with Diverse Reasoning Chains Paper • 2501.05707 • Published Jan 10 • 20
Apollo: An Exploration of Video Understanding in Large Multimodal Models Paper • 2412.10360 • Published Dec 13, 2024 • 140