Collections
Discover the best community collections!
Collections including paper arxiv:2506.01844
-
Learning Video Generation for Robotic Manipulation with Collaborative Trajectory Control
Paper • 2506.01943 • Published • 24 -
LoHoVLA: A Unified Vision-Language-Action Model for Long-Horizon Embodied Tasks
Paper • 2506.00411 • Published • 30 -
SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics
Paper • 2506.01844 • Published • 90
-
GR00T N1: An Open Foundation Model for Generalist Humanoid Robots
Paper • 2503.14734 • Published • 3 -
Mobile ALOHA: Learning Bimanual Mobile Manipulation with Low-Cost Whole-Body Teleoperation
Paper • 2401.02117 • Published • 34 -
SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics
Paper • 2506.01844 • Published • 90
-
DeepSeek-R1 Thoughtology: Let's <think> about LLM Reasoning
Paper • 2504.07128 • Published • 85 -
Byte Latent Transformer: Patches Scale Better Than Tokens
Paper • 2412.09871 • Published • 104 -
BitNet b1.58 2B4T Technical Report
Paper • 2504.12285 • Published • 73 -
FAST: Efficient Action Tokenization for Vision-Language-Action Models
Paper • 2501.09747 • Published • 24
-
Gemini Robotics: Bringing AI into the Physical World
Paper • 2503.20020 • Published • 26 -
Magma: A Foundation Model for Multimodal AI Agents
Paper • 2502.13130 • Published • 58 -
LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents
Paper • 2311.05437 • Published • 51 -
OS-ATLAS: A Foundation Action Model for Generalist GUI Agents
Paper • 2410.23218 • Published • 51
-
GRUtopia: Dream General Robots in a City at Scale
Paper • 2407.10943 • Published • 26 -
Make-An-Agent: A Generalizable Policy Network Generator with Behavior-Prompted Diffusion
Paper • 2407.10973 • Published • 11 -
Cross Anything: General Quadruped Robot Navigation through Complex Terrains
Paper • 2407.16412 • Published • 6 -
RP1M: A Large-Scale Motion Dataset for Piano Playing with Bi-Manual Dexterous Robot Hands
Paper • 2408.11048 • Published • 4