
VAST-AI/MIDI-3D
Image-to-3D
Generate text and speech from audio, video, and text inputs
Remove background from ID photos
Generate images from text prompts
Large Animatable Human Model
Try Orpheus TTS here
Convert voice to match another's style or tone
Audio Conditioned LipSync with Latent Diffusion Models