lmms-lab/LLaVA-OneVision-1.5-8B-Instruct Image-Text-to-Text • 9B • Updated 2 days ago • 1.22k • 30
lmms-lab/LLaVA-OneVision-1.5-Insturct-Data Viewer • Updated about 12 hours ago • 19.3M • 27.3k • 25
lmms-lab/LLaVA-One-Vision-1.5-Mid-Training-85M Viewer • Updated about 13 hours ago • 40.5M • 24.5k • 23
Collaborative Multi-Modal Coding for High-Quality 3D Generation Paper • 2508.15228 • Published Aug 21 • 4
EgoTwin: Dreaming Body and View in First Person Paper • 2508.13013 • Published Aug 18 • 20 • 2
Has GPT-5 Achieved Spatial Intelligence? An Empirical Study Paper • 2508.13142 • Published Aug 18 • 33
4DNeX: Feed-Forward 4D Generative Modeling Made Easy Paper • 2508.13154 • Published Aug 18 • 60
Cut2Next: Generating Next Shot via In-Context Tuning Paper • 2508.08244 • Published Aug 11 • 13
DPoser-X: Diffusion Model as Robust 3D Whole-body Human Pose Prior Paper • 2508.00599 • Published Aug 1 • 6
Hi3DEval: Advancing 3D Generation Evaluation with Hierarchical Validity Paper • 2508.05609 • Published Aug 7 • 29
LongVie: Multimodal-Guided Controllable Ultra-Long Video Generation Paper • 2508.03694 • Published Aug 5 • 50