Apply for community grant: Academic project (gpu)
We introduce a novel diffusion-based method for video frame interpolation (VFI) specifically designed for hand-drawn animation. Our model, Multi-Input ResShift Diffusion VFI, explicitly integrates and re-estimates the interpolation time during training to handle the large temporal variability found in animated content. It generalizes the ResShift diffusion scheme (originally proposed for super-resolution) for efficient generation in very few steps (~10). Given two input frames, our Gradio app generates either a single intermediate frame or a full sequence of interpolated frames, depending on the number of samples selected. Currently, the model relies on custom CUDA kernels written with CuPy for performance, so the app won't run on the free CPU hardware.
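For reference, here is a minimal sketch of the app's input/output layout. The `interpolate` function below is a hypothetical stand-in that just blends the two frames linearly so the sketch runs anywhere; the actual Space calls the diffusion model (with its CuPy CUDA kernels) in its place.

```python
import gradio as gr
import numpy as np

def interpolate(frame_a, frame_b, num_samples):
    # Hypothetical stand-in for Multi-Input ResShift Diffusion VFI:
    # the real model runs ~10 diffusion steps on GPU via custom CuPy
    # CUDA kernels. Here we blend linearly for illustration only.
    frames = []
    for i in range(1, num_samples + 1):
        t = i / (num_samples + 1)  # interpolation time in (0, 1)
        frames.append(((1 - t) * frame_a + t * frame_b).astype(np.uint8))
    return frames

demo = gr.Interface(
    fn=interpolate,
    inputs=[
        gr.Image(label="Frame A", type="numpy"),
        gr.Image(label="Frame B", type="numpy"),
        gr.Slider(1, 15, value=1, step=1, label="Number of samples"),
    ],
    outputs=gr.Gallery(label="Interpolated frames"),
)

if __name__ == "__main__":
    demo.launch()
```

With one sample the app returns a single intermediate frame; with more samples it returns the full interpolated sequence between the two inputs.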
For more details, you can see the following links:
arXiv paper: https://arxiv.org/pdf/2504.05402
GitHub repo: https://github.com/VicFonch/Multi-Input-Resshift-Diffusion-VFI
Sorry, I switched the hardware to L4 because CuPy doesn't work with ZeroGPU.
Thank you so much for the support and the hardware. L4 works wonders for the model! I really appreciate it.