Transcribe audio or YouTube videos into text
Generate images from text prompts
Analyze images to detect human poses