GoT-R1: Unleashing Reasoning Capability of MLLM for Visual Generation with Reinforcement Learning Paper • 2505.17022 • Published May 22 • 27
DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution Paper • 2411.02359 • Published Nov 4, 2024 • 13
Improving and Benchmarking Offline Reinforcement Learning Algorithms Paper • 2306.00972 • Published Jun 1, 2023
Absolute Zero: Reinforced Self-play Reasoning with Zero Data Paper • 2505.03335 • Published May 6 • 178
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model? Paper • 2504.13837 • Published Apr 18 • 132
Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model Paper • 2504.08685 • Published Apr 11 • 129
Bridging Continuous and Discrete Tokens for Autoregressive Visual Generation Paper • 2503.16430 • Published Mar 20 • 34
Bridging Continuous and Discrete Tokens for Autoregressive Visual Generation Paper • 2503.16430 • Published Mar 20 • 34
MagicInfinite: Generating Infinite Talking Videos with Your Words and Voice Paper • 2503.05978 • Published Mar 7 • 36
Video Depth Anything: Consistent Depth Estimation for Super-Long Videos Paper • 2501.12375 • Published Jan 21 • 22
VideoWorld: Exploring Knowledge Learning from Unlabeled Videos Paper • 2501.09781 • Published Jan 16 • 29
Prompting Depth Anything for 4K Resolution Accurate Metric Depth Estimation Paper • 2412.14015 • Published Dec 18, 2024 • 12
How Far is Video Generation from World Model: A Physical Law Perspective Paper • 2411.02385 • Published Nov 4, 2024 • 35
Model Surgery: Modulating LLM's Behavior Via Simple Parameter Editing Paper • 2407.08770 • Published Jul 11, 2024 • 21
LLM-based Optimization of Compound AI Systems: A Survey Paper • 2410.16392 • Published Oct 21, 2024 • 16