Factorized Learning for Temporally Grounded Video-Language Models
Paper • 2512.24097 • Published • 6
EVOLVE-VLA: Test-Time Training from Environment Feedback for Vision-Language-Action Models
X-Humanoid: Robotize Human Videos to Generate Humanoid Videos at Scale