Sapiens Collection Foundation models for human tasks. Code: https://github.com/facebookresearch/sapiens • 72 items • Updated Sep 18, 2024 • 60
LLaVA-OneVision-1.5: Fully Open Framework for Democratized Multimodal Training Paper • 2509.23661 • Published Sep 28 • 44
Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model Paper • 2502.10248 • Published Feb 14 • 55