InfiniteTalk

InfiniteTalk: Audio-driven Video Generation for Sparse-Frame Video Dubbing

InfiniteTalk Website | Paper on arXiv | GitHub

We propose **InfiniteTalk**, a novel sparse-frame video dubbing framework. Given an input video and an audio track, InfiniteTalk synthesizes a new video with **accurate lip synchronization** while **simultaneously aligning head movements, body posture, and facial expressions** with the audio. Unlike traditional dubbing methods that focus solely on the lips, InfiniteTalk enables **infinite-length video generation** with accurate lip synchronization and consistent identity preservation. In addition, InfiniteTalk can be used as an image-audio-to-video model, taking an image and an audio clip as input.

  • 💬 **Sparse-frame Video Dubbing** – Synchronizes not only the lips, but also the head, body, and facial expressions
  • ⏱️ **Infinite-Length Generation** – Supports unlimited video duration
  • ✨ **Stability** – Reduces hand and body distortions compared to MultiTalk
  • 🚀 **Lip Accuracy** – Achieves more accurate lip synchronization than MultiTalk

This repository hosts the model weights for InfiniteTalk. For installation, usage instructions, and further documentation, please visit our GitHub repository.
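The weights can be fetched with the Hugging Face Hub CLI; a minimal sketch (the local directory path is just an illustration):

```shell
# Download the InfiniteTalk model weights from the Hugging Face Hub
# (requires the huggingface_hub package: pip install -U "huggingface_hub[cli]")
huggingface-cli download MeiGen-AI/InfiniteTalk --local-dir ./weights/InfiniteTalk
```

See the GitHub repository for how the downloaded weights are expected to be laid out for inference.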

License Agreement

The models in this repository are licensed under the Apache 2.0 License. We claim no rights over content you generate, granting you the freedom to use it, provided your usage complies with the provisions of this license. You are fully accountable for your use of the models: you must not share content that violates applicable laws, causes harm to individuals or groups, disseminates personal information with intent to harm, spreads misinformation, or targets vulnerable populations.

