Guidance in the Frequency Domain Enables High-Fidelity Sampling at Low CFG Scales • Paper • 2506.19713 • Published 3 days ago • 12
GRPO-CARE: Consistency-Aware Reinforcement Learning for Multimodal Reasoning • Paper • 2506.16141 • Published 9 days ago • 25
Hunyuan-GameCraft: High-dynamic Interactive Game Video Generation with Hybrid History Condition • Paper • 2506.17201 • Published 7 days ago • 43
DreamCube: 3D Panorama Generation via Multi-plane Synchronization • Paper • 2506.17206 • Published 7 days ago • 19
VIKI-R: Coordinating Embodied Multi-Agent Cooperation via Reinforcement Learning • Paper • 2506.09049 • Published 17 days ago • 32
GenRecal: Generation after Recalibration from Large to Small Vision-Language Models • Paper • 2506.15681 • Published 9 days ago • 36
Stream-Omni: Simultaneous Multimodal Interactions with Large Language-Vision-Speech Model • Paper • 2506.13642 • Published 11 days ago • 26
LongLLaDA: Unlocking Long Context Capabilities in Diffusion LLMs • Paper • 2506.14429 • Published 11 days ago • 43
Ambient Diffusion Omni: Training Good Models with Bad Data • Paper • 2506.10038 • Published 17 days ago • 9
Align Your Flow: Scaling Continuous-Time Flow Map Distillation • Paper • 2506.14603 • Published 10 days ago • 18
BridgeVLA: Input-Output Alignment for Efficient 3D Manipulation Learning with Vision-Language Models • Paper • 2506.07961 • Published 18 days ago • 11
Scientists' First Exam: Probing Cognitive Abilities of MLLM via Perception, Understanding, and Reasoning • Paper • 2506.10521 • Published 16 days ago • 65