Implicit Reasoning in Transformers is Reasoning through Shortcuts Paper • 2503.07604 • Published 18 days ago • 21
LightGen: Efficient Image Generation through Knowledge Distillation and Direct Preference Optimization Paper • 2503.08619 • Published 17 days ago • 20
Gemini Embedding: Generalizable Embeddings from Gemini Paper • 2503.07891 • Published 18 days ago • 34
UniF²ace: Fine-grained Face Understanding and Generation with Unified Multimodal Models Paper • 2503.08120 • Published 18 days ago • 30
MagicInfinite: Generating Infinite Talking Videos with Your Words and Voice Paper • 2503.05978 • Published 21 days ago • 34
LMM-R1: Empowering 3B LMMs with Strong Reasoning Abilities Through Two-Stage Rule-Based RL Paper • 2503.07536 • Published 18 days ago • 80
Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia Paper • 2503.07920 • Published 18 days ago • 95
Self-Taught Self-Correction for Small Language Models Paper • 2503.08681 • Published 17 days ago • 13
GTR: Guided Thought Reinforcement Prevents Thought Collapse in RL-based VLM Agent Training Paper • 2503.08525 • Published 17 days ago • 15
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning Paper • 2503.09516 • Published 16 days ago • 27
On the Limitations of Vision-Language Models in Understanding Image Transforms Paper • 2503.09837 • Published 16 days ago • 10