Ovi - Generate Videos With Audio Like VEO 3 or SORA 2 - Run Locally - Open Source for Free
Download and install : https://www.patreon.com/posts/140393220
Quick demo tutorial : https://youtu.be/uE0QabiHmRw
Ovi: Twin Backbone Cross-Modal Fusion for Audio-Video Generation
Project page : https://aaxwaz.github.io/Ovi/
SECourses Ovi Pro Premium App Features
A full-scale, advanced app for Ovi - an open source project that can generate videos with real audio from either a text prompt or an image plus a text prompt.
Project page is here : https://aaxwaz.github.io/Ovi/
I have developed an advanced Gradio app and a much better pipeline that fully supports block swapping.
We can now generate full-quality videos with as little as 8.2 GB of VRAM.
I hope to work on dynamic on-load FP8_Scaled support tomorrow to reduce VRAM usage even further, so more VRAM optimizations should follow soon.
Our block swapping implementation is the best one I know of - I took the approach from the well-known Kohya Musubi Tuner.
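To illustrate the idea behind block swapping, here is a minimal, framework-free sketch. It is not the app's code and not Musubi Tuner's implementation. The names `Block`, `BlockSwapper`, and `max_resident`, and the "cpu"/"gpu" labels, are hypothetical and only simulate where each block lives. The idea is that only a limited number of model blocks stay in VRAM at once, while the rest wait in system RAM until the forward pass reaches them.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """Stand-in for one model block; `device` is only a label in this simulation."""
    index: int
    device: str = "cpu"

    def forward(self, x: float) -> float:
        # A real block would run attention/MLP on the GPU; here we only check placement.
        assert self.device == "gpu", "block must be on the GPU before it runs"
        return x + 1.0


@dataclass
class BlockSwapper:
    """Keeps at most `max_resident` blocks on the 'gpu' at any time (hypothetical helper)."""
    blocks: list
    max_resident: int
    peak_resident: int = 0
    resident: list = field(default_factory=list)

    def _load(self, block: Block) -> None:
        # Evict the oldest resident block(s) back to 'cpu' when the budget is used up.
        while len(self.resident) >= self.max_resident:
            evicted = self.resident.pop(0)
            evicted.device = "cpu"
        block.device = "gpu"
        self.resident.append(block)
        self.peak_resident = max(self.peak_resident, len(self.resident))

    def forward(self, x: float) -> float:
        # Run the blocks in order, swapping each one in just before it is needed.
        for block in self.blocks:
            if block.device != "gpu":
                self._load(block)
            x = block.forward(x)
        return x


swapper = BlockSwapper(blocks=[Block(i) for i in range(8)], max_resident=2)
result = swapper.forward(0.0)
print(result, swapper.peak_resident)
```

In this toy run all 8 blocks execute, yet no more than 2 are ever resident at once. In a real pipeline the "move" would be a tensor transfer between system RAM and VRAM, so a smaller resident budget lowers VRAM usage at the cost of extra transfer time.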
The 1-click installer installs into a Python 3.10.11 venv and automatically downloads the models as well, so it is literally 1-click.
The installer automatically sets up Torch 2.8, CUDA 12.9, and Flash Attention 2.8.3, and it supports essentially all recent NVIDIA GPUs, such as the RTX 3000, 4000, and 5000 series, H100, B200, etc.
All generations are saved inside the outputs folder, and the app supports many features such as batch folder processing, a configurable number of generations, and full preset save and load.
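As a rough illustration of what preset save and load can look like, here is a small sketch that stores settings as JSON files. This is an assumption about the general technique, not the app's actual format: the functions `save_preset` and `load_preset`, the `presets` folder, and the setting keys are all hypothetical.

```python
import json
import tempfile
from pathlib import Path


def save_preset(preset_dir: Path, name: str, settings: dict) -> Path:
    """Write the settings to <preset_dir>/<name>.json (hypothetical layout)."""
    preset_dir.mkdir(parents=True, exist_ok=True)
    path = preset_dir / f"{name}.json"
    path.write_text(json.dumps(settings, indent=2), encoding="utf-8")
    return path


def load_preset(preset_dir: Path, name: str) -> dict:
    """Read a preset back into a plain dictionary."""
    path = preset_dir / f"{name}.json"
    return json.loads(path.read_text(encoding="utf-8"))


# Demo in a temporary folder so nothing is left on disk.
with tempfile.TemporaryDirectory() as tmp:
    presets = Path(tmp) / "presets"
    # Illustrative keys only; the real app's settings may differ.
    settings = {"num_generations": 4, "block_swap": 12, "seed": 42}
    save_preset(presets, "low_vram", settings)
    restored = load_preset(presets, "low_vram")
```

Saving every UI setting into one file like this makes it possible to return to a known-good configuration for a given GPU without re-entering each value.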
This is a rush release (built in less than a day), so there may be errors. Please let me know and I will hopefully improve the app.
Look at the examples to understand how to prompt the model; this is extremely important.
An RTX 5090 can run it without any block swapping, using only CPU offloading, and it is really fast.