Upload an image and ask questions about it
A leaderboard for multimodal models
Mark regions in images based on text descriptions