Detect objects in images
Analyze image and text inputs and generate text
Generate text, answer questions, translate, and detect objects