Audio Conditioned LipSync with Latent Diffusion Models
Extract garment images from everyday images!
With IP Adapter
Use a GPU for fast video face swapping
Long-context vision-language understanding
Convert screenshots to HTML code
Generate virtual try-on images for clothing