Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning Paper • 2506.01939 • Published Jun 2 • 170
YuE: Scaling Open Foundation Models for Long-Form Music Generation Paper • 2503.08638 • Published Mar 11 • 68
HybridNorm: Towards Stable and Efficient Transformer Training via Hybrid Normalization Paper • 2503.04598 • Published Mar 6 • 20
Scale-Distribution Decoupling: Enabling Stable and Effective Training of Large Language Models Paper • 2502.15499 • Published Feb 21 • 14
SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines Paper • 2502.14739 • Published Feb 20 • 105
Towards a Unified View of Preference Learning for Large Language Models: A Survey Paper • 2409.02795 • Published Sep 4, 2024 • 74
FuzzCoder: Byte-level Fuzzing Test via Large Language Model Paper • 2409.01944 • Published Sep 3, 2024 • 46
TableBench: A Comprehensive and Complex Benchmark for Table Question Answering Paper • 2408.09174 • Published Aug 17, 2024 • 53