MultiRef: Controllable Image Generation with Multiple Visual References Paper • 2508.06905 • Published 15 days ago • 19
Are We on the Right Way for Assessing Document Retrieval-Augmented Generation? Paper • 2508.03644 • Published 18 days ago • 24
Janus Collection Janus is a novel autoregressive framework that unifies multimodal understanding and generation. • 8 items • Updated Feb 18 • 17
view article Article Introducing the Chatbot Guardrails Arena By sonalipnaik and 3 others • Mar 21, 2024 • 5
SurveyX: Academic Survey Automation via Large Language Models Paper • 2502.14776 • Published Feb 20 • 101
The Differences Between Direct Alignment Algorithms are a Blur Paper • 2502.01237 • Published Feb 3 • 115
OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models Paper • 2502.01061 • Published Feb 3 • 222
MiniMax-01: Scaling Foundation Models with Lightning Attention Paper • 2501.08313 • Published Jan 14 • 298
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Paper • 2501.04519 • Published Jan 8 • 283
Perception Tokens Enhance Visual Reasoning in Multimodal Language Models Paper • 2412.03548 • Published Dec 4, 2024 • 17
ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery Paper • 2410.05080 • Published Oct 7, 2024 • 21
Training Language Models to Self-Correct via Reinforcement Learning Paper • 2409.12917 • Published Sep 19, 2024 • 141
Ferret: Faster and Effective Automated Red Teaming with Reward-Based Scoring Technique Paper • 2408.10701 • Published Aug 20, 2024 • 12
Out-of-Distribution Detection with Attention Head Masking for Multimodal Document Classification Paper • 2408.11237 • Published Aug 20, 2024 • 6