Try Orpheus TTS here
Generate realistic audio from text
Generate high-resolution images from text prompts