VicFonch

metadata

tags:
  - video-frame-interpolation
  - diffusion-model
  - animation
  - uncertainty-estimation

🤖 Multi‑Input ResShift Diffusion VFI

⚙️ Setup

Start by downloading the source code directly from GitHub.

git clone https://github.com/VicFonch/Multi-Input-Resshift-Diffusion-VFI.git

Create a conda environment and install the requirements:

conda create -n multi-input-resshift python=3.10
conda activate multi-input-resshift
pip install -r requirements.txt

Note: Make sure your system is compatible with CUDA 12.4. If it is not, install the CuPy build that matches your installed CUDA version.
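One way to pick a matching CuPy wheel is to read the CUDA version that your PyTorch build reports and map its major version to a wheel name. The helper below is a minimal sketch and not part of the repository: `cupy_wheel_for` is a hypothetical name, and the mapping covers only the `cupy-cuda11x` and `cupy-cuda12x` wheels.

```python
def cupy_wheel_for(cuda_version: str) -> str:
    """Return the CuPy wheel name for a CUDA version string such as "12.4"."""
    major = int(cuda_version.split(".")[0])
    # Assumption: only the CUDA 11.x and 12.x prebuilt wheels are considered.
    wheels = {11: "cupy-cuda11x", 12: "cupy-cuda12x"}
    if major not in wheels:
        raise ValueError(f"No known prebuilt CuPy wheel for CUDA {cuda_version}")
    return wheels[major]


if __name__ == "__main__":
    try:
        import torch
        detected = torch.version.cuda  # None on CPU-only PyTorch builds
    except ImportError:
        detected = None

    if detected is None:
        print("No CUDA-enabled PyTorch build detected.")
    else:
        try:
            print(f"CUDA {detected}: pip install {cupy_wheel_for(detected)}")
        except ValueError as err:
            print(err)
```

Run it inside the activated conda environment, after PyTorch is installed, so that the reported version is the one the model will actually use.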

🚀 Inference Example

import os
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt

from torchvision.transforms import Compose, ToTensor, Resize, Normalize
from utils.utils import denorm
from model.hub import MultiInputResShiftHub

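# Load the pretrained weights from the Hugging Face Hub, freeze them,
# move the model to the GPU, and switch to evaluation mode.
model = MultiInputResShiftHub.from_pretrained("vfontech/Multiple-Input-Resshift-VFI")
model.requires_grad_(False).cuda().eval()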

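# Build the paths with os.path.join so they also work outside Windows.
img0_path = os.path.join("_data", "example_images", "frame1.png")
img2_path = os.path.join("_data", "example_images", "frame3.png")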

# Resize to 256x448 (height x width) and map pixel values from [0, 1] to [-1, 1].
transforms = Compose([
    Resize((256, 448)),
    ToTensor(),
    Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),
])

img0 = transforms(Image.open(img0_path).convert("RGB")).unsqueeze(0).cuda()
img2 = transforms(Image.open(img2_path).convert("RGB")).unsqueeze(0).cuda()
# Time step of the frame to synthesize between img0 and img2.
tau = 0.5

img1 = model.reverse_process([img0, img2], tau)

# Show the first input frame (left) next to the interpolated frame (right).
plt.figure(figsize=(10, 5))
plt.subplot(1, 2, 1)
plt.imshow(denorm(img0, mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]).squeeze().permute(1, 2, 0).cpu().numpy())
plt.subplot(1, 2, 2)
plt.imshow(denorm(img1, mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]).squeeze().permute(1, 2, 0).cpu().numpy())
plt.show()
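To generate several in-between frames instead of a single one, one option is to call `reverse_process` once per time step. The sketch below only builds the list of time steps. It assumes that `tau` is a normalized position strictly between 0 and 1 and that the model accepts values other than 0.5; neither assumption is confirmed here. `tau_schedule` is a hypothetical helper, not part of the repository.

```python
def tau_schedule(num_frames: int) -> list[float]:
    """Return num_frames evenly spaced time steps strictly between 0 and 1."""
    if num_frames < 1:
        raise ValueError("num_frames must be at least 1")
    return [i / (num_frames + 1) for i in range(1, num_frames + 1)]


# Usage with the objects defined in the inference example (not run here):
# frames = [model.reverse_process([img0, img2], t) for t in tau_schedule(3)]
```

With `num_frames=1` the schedule reduces to the single midpoint used in the example above.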