Rethinking Diverse Human Preference Learning through Principal Component Analysis Paper • 2502.13131 • Published 3 days ago • 34
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention Paper • 2502.11089 • Published 5 days ago • 129
Learning Getting-Up Policies for Real-World Humanoid Robots Paper • 2502.12152 • Published 4 days ago • 35
Multimodal Mamba: Decoder-only Multimodal State Space Model via Quadratic to Linear Distillation Paper • 2502.13145 • Published 3 days ago • 35
SafeRoute: Adaptive Model Selection for Efficient and Accurate Safety Guardrails in Large Language Models Paper • 2502.12464 • Published 4 days ago • 26
Soundwave: Less is More for Speech-Text Alignment in LLMs Paper • 2502.12900 • Published 3 days ago • 73
The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding Paper • 2502.08946 • Published 9 days ago • 180
InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU Paper • 2502.08910 • Published 9 days ago • 139
Expect the Unexpected: FailSafe Long Context QA for Finance Paper • 2502.06329 • Published 11 days ago • 123
Step Back to Leap Forward: Self-Backtracking for Boosting Reasoning of Language Models Paper • 2502.04404 • Published 15 days ago • 20
QuEST: Stable Training of LLMs with 1-Bit Weights and Activations Paper • 2502.05003 • Published 14 days ago • 41
GuardReasoner: Towards Reasoning-based LLM Safeguards Paper • 2501.18492 • Published 22 days ago • 81
People who frequently use ChatGPT for writing tasks are accurate and robust detectors of AI-generated text Paper • 2501.15654 • Published 26 days ago • 11
Early External Safety Testing of OpenAI's o3-mini: Insights from the Pre-Deployment Evaluation Paper • 2501.17749 • Published 23 days ago • 13
Mixture-of-Mamba: Enhancing Multi-Modal State-Space Models with Modality-Aware Sparsity Paper • 2501.16295 • Published 25 days ago • 8