Convert and upload models to Hugging Face
Generate realistic audio from text
Generate a talking face from audio
Generate depth video from input video
Generate detailed images from a prompt and an image