Process videos from webpages and answer questions about them
VLM-R1 model for Open-Vocabulary Object Detection
Highlight described objects in images
Open Agent Leaderboard
Upload images and ask questions to get answers
Generate text from images and videos