view article Article RynnEC: Bringing MLLMs into Embodied World By Alibaba-DAMO-Academy and 6 others • 9 days ago • 6
Towards Affordance-Aware Robotic Dexterous Grasping with Human-like Priors Paper • 2508.08896 • Published 11 days ago • 10
view article Article SmolVLA: Efficient Vision-Language-Action Model trained on Lerobot Community Data By danaaubakirova and 8 others • Jun 3 • 232
view article Article RynnVLA-001: Using Human Demonstrations to Improve Robot Manipulation By Alibaba-DAMO-Academy and 9 others • 12 days ago • 25
VL-Cogito: Progressive Curriculum Reinforcement Learning for Advanced Multimodal Reasoning Paper • 2507.22607 • Published 24 days ago • 45
MiroMind-M1: An Open-Source Advancement in Mathematical Reasoning via Context-Aware Multi-Stage Policy Optimization Paper • 2507.14683 • Published Jul 19 • 126
Through the Valley: Path to Effective Long CoT Training for Small Language Models Paper • 2506.07712 • Published Jun 9 • 18
Lingshu: A Generalist Foundation Model for Unified Multimodal Medical Understanding and Reasoning Paper • 2506.07044 • Published Jun 8 • 112
Satori-SWE: Evolutionary Test-Time Scaling for Sample-Efficient Software Engineering Paper • 2505.23604 • Published May 29 • 24
LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization Paper • 2502.13922 • Published Feb 19 • 28
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published Jan 22 • 416
VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding Paper • 2501.13106 • Published Jan 22 • 90
VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM Paper • 2501.00599 • Published Dec 31, 2024 • 48
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining Paper • 2501.00958 • Published Jan 1 • 107
M-Longdoc: A Benchmark For Multimodal Super-Long Document Understanding And A Retrieval-Aware Tuning Framework Paper • 2411.06176 • Published Nov 9, 2024 • 46
Llama 3.1 Collection This collection hosts the transformers and original repos of the Llama 3.1, Llama Guard 3 and Prompt Guard models • 11 items • Updated Dec 6, 2024 • 683
Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss Paper • 2410.17243 • Published Oct 22, 2024 • 95
Mitigating Object Hallucination via Concentric Causal Attention Paper • 2410.15926 • Published Oct 21, 2024 • 17