Spider 2.0: Evaluating Language Models on Real-World Enterprise Text-to-SQL Workflows Paper • 2411.07763 • Published Nov 12, 2024
When Attention Sink Emerges in Language Models: An Empirical View Paper • 2410.10781 • Published Oct 14, 2024
Sailor2: Sailing in South-East Asia with Inclusive Multilingual LLMs Paper • 2502.12982 • Published Feb 18 • 15
Predictive Data Selection: The Data That Predicts Is the Data That Teaches Paper • 2503.00808 • Published 26 days ago • 55
Video SimpleQA: Towards Factuality Evaluation in Large Video Language Models Paper • 2503.18923 • Published 3 days ago • 11
Long Context is Not Long at All: A Prospector of Long-Dependency Data for Large Language Models Paper • 2405.17915 • Published May 28, 2024 • 2
DEEM: Diffusion Models Serve as the Eyes of Large Language Models for Image Perception Paper • 2405.15232 • Published May 24, 2024 • 2
AgentCourt: Simulating Court with Adversarial Evolvable Lawyer Agents Paper • 2408.08089 • Published Aug 15, 2024
Ruler: A Model-Agnostic Method to Control Generated Length for Large Language Models Paper • 2409.18943 • Published Sep 27, 2024 • 29
A Comprehensive Survey on Long Context Language Modeling Paper • 2503.17407 • Published 7 days ago • 45
SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild Paper • 2503.18892 • Published 3 days ago • 26
SkyLadder: Better and Faster Pretraining via Context Window Scheduling Paper • 2503.15450 • Published 8 days ago • 11