ByteMorph: Benchmarking Instruction-Guided Image Editing with Non-Rigid Motions Paper • 2506.03107 • Published Jun 3 • 1
VLM4D: Towards Spatiotemporal Awareness in Vision Language Models Paper • 2508.02095 • Published 19 days ago • 6
Feature4X: Bridging Any Monocular Video to 4D Agentic AI with Versatile Gaussian Feature Fields Paper • 2503.20776 • Published Mar 26 • 8