DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published 8 days ago • 270
VideoWorld: Exploring Knowledge Learning from Unlabeled Videos Paper • 2501.09781 • Published 14 days ago • 24
Do generative video models learn physical principles from watching videos? Paper • 2501.09038 • Published 16 days ago • 31
Textoon: Generating Vivid 2D Cartoon Characters from Text Descriptions Paper • 2501.10020 • Published 13 days ago • 22
Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps Paper • 2501.09732 • Published 14 days ago • 66
RepVideo: Rethinking Cross-Layer Representation for Video Generation Paper • 2501.08994 • Published 15 days ago • 15
Image / Video Gen Collection Image Generation Using Diffusion-Based Methods: Tips and Techniques for Stable Diffusion • 33 items • Updated 15 days ago • 7
The Lessons of Developing Process Reward Models in Mathematical Reasoning Paper • 2501.07301 • Published 17 days ago • 89
MiniMax-01: Scaling Foundation Models with Lightning Attention Paper • 2501.08313 • Published 16 days ago • 271
LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs Paper • 2501.06186 • Published 20 days ago • 59
Multimodal Language Model Collection What matters besides the data recipe when training a multimodal language model? • 29 items • Updated 22 days ago • 1
Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos Paper • 2501.04001 • Published 23 days ago • 42