Submit media inputs to generate text and speech responses
Gemini 2.0 native image generation: co-doodling