STAR: Spatial-Temporal Augmentation with Text-to-Video Models for Real-World Video Super-Resolution

Code: https://github.com/NJU-PCALab/STAR

Paper: https://arxiv.org/abs/2501.02976

Project Page: https://nju-pcalab.github.io/projects/STAR

Demo Video: https://youtu.be/hx0zrql-SrU

⚙️ Dependencies and Installation

## git clone this repository
git clone https://github.com/NJU-PCALab/STAR.git
cd STAR

## create an environment
conda create -n star python=3.10
conda activate star
pip install -r requirements.txt
sudo apt-get update && sudo apt-get install -y ffmpeg libsm6 libxext6
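
Before moving on, it can help to confirm that the command-line tools used above are on the PATH. The following is a minimal sketch and is not part of the STAR repository; the helper name `missing_tools` is hypothetical.

```python
import shutil


def missing_tools(tools):
    """Return the names in `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    # git is used to clone the repository; ffmpeg is installed by apt-get above.
    missing = missing_tools(["git", "ffmpeg"])
    print("missing:", ", ".join(missing) if missing else "none")
```

An empty result means both tools were found; anything listed should be installed before running inference.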

🚀 Inference

Model Weight

| Base Model | Type | URL |
|---|---|---|
| I2VGen-XL | Light Degradation | :link: |
| I2VGen-XL | Heavy Degradation | :link: |
| CogVideoX-5B | Heavy Degradation | :link: |

1. I2VGen-XL-based

Step 1: Download the pretrained STAR model from HuggingFace.

We provide two versions of the I2VGen-XL-based model: heavy_deg.pt for heavily degraded videos and light_deg.pt for lightly degraded videos (e.g., low-resolution videos downloaded from video websites).

Put the downloaded weights in pretrained_weight/.
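
As a rough illustration of the choice between the two checkpoints, the sketch below maps a degradation level to the matching file under pretrained_weight/. The file names come from the text above; the helper `weight_path` and the `"heavy"`/`"light"` keys are hypothetical and not part of the repository.

```python
from pathlib import Path

# File names are taken from the README; the keys are an assumption for this sketch.
WEIGHT_FILES = {"heavy": "heavy_deg.pt", "light": "light_deg.pt"}


def weight_path(degradation, root="pretrained_weight"):
    """Map a degradation level ('heavy' or 'light') to its checkpoint path."""
    if degradation not in WEIGHT_FILES:
        raise ValueError(f"expected 'heavy' or 'light', got {degradation!r}")
    return Path(root) / WEIGHT_FILES[degradation]


if __name__ == "__main__":
    path = weight_path("light")
    status = "found" if path.is_file() else "not downloaded yet"
    print(f"{path} ({status})")
```

The resulting path is what you would later use as model_path in Step 3.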

Step 2: Prepare testing data

Put the test videos in input/video/.

There are three options for the prompt: (1) no prompt, (2) a prompt generated automatically with PLLaVA, or (3) a manually written prompt. Put the prompt txt file in input/text/.
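
The directory layout described in this step can be set up and checked with a few lines of Python. This is a minimal sketch under stated assumptions: the helper `prepare_inputs` is hypothetical, and the set of video extensions is a guess that you may need to extend. It makes no assumption about the contents of the prompt files.

```python
from pathlib import Path

# Assumed container formats; adjust to match your own files.
VIDEO_EXTENSIONS = {".mp4", ".mov", ".avi", ".mkv"}


def prepare_inputs(root="input"):
    """Create <root>/video and <root>/text if needed and list what they contain."""
    video_dir = Path(root) / "video"
    text_dir = Path(root) / "text"
    video_dir.mkdir(parents=True, exist_ok=True)
    text_dir.mkdir(parents=True, exist_ok=True)
    # Only files with a known video extension are reported.
    videos = sorted(p.name for p in video_dir.iterdir()
                    if p.is_file() and p.suffix.lower() in VIDEO_EXTENSIONS)
    prompts = sorted(p.name for p in text_dir.glob("*.txt"))
    return videos, prompts
```

Calling `prepare_inputs()` from the repository root returns the video and prompt file names it found, which makes it easy to spot an empty or misplaced folder before inference.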

Step 3: Change the paths

Update the paths in video_super_resolution/scripts/inference_sr.sh to match your local setup, including video_folder_path, txt_file_path, model_path, and save_dir.
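
Editing the script by hand is usually enough, but the same change can be scripted. The sketch below assumes the variables appear in inference_sr.sh as plain `name=value` assignments, one per line; that layout is an assumption, so check the actual file first. The helper `set_shell_vars` is hypothetical.

```python
import re

# Matches an optionally indented shell assignment such as `model_path=...`.
ASSIGNMENT = re.compile(r"^(\s*)([A-Za-z_][A-Za-z0-9_]*)=")


def set_shell_vars(script_text, updates):
    """Return `script_text` with `name=value` lines rewritten from `updates`."""
    lines = []
    for line in script_text.splitlines():
        match = ASSIGNMENT.match(line)
        if match and match.group(2) in updates:
            indent, name = match.group(1), match.group(2)
            line = f'{indent}{name}="{updates[name]}"'
        lines.append(line)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Stand-in text for illustration; not the real contents of inference_sr.sh.
    sample = "video_folder_path=/old/videos\nmodel_path=/old/model.pt\n"
    print(set_shell_vars(sample, {
        "video_folder_path": "input/video",
        "model_path": "pretrained_weight/heavy_deg.pt",
    }), end="")
```

Lines that do not assign one of the named variables are left untouched, so the rest of the script is preserved.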

Step 4: Run the inference command

bash video_super_resolution/scripts/inference_sr.sh

If you run out of memory (OOM), set a smaller frame_length in inference_sr.sh.
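
A smaller frame_length presumably helps because fewer frames are held in memory at once. The sketch below illustrates that idea under the assumption that frame_length is the number of frames processed per chunk; the chunking logic in the repository may differ (for example, chunks could overlap). The helper `chunk_ranges` is hypothetical.

```python
def chunk_ranges(total_frames, frame_length):
    """Split frame indices [0, total_frames) into consecutive, non-overlapping chunks."""
    if frame_length <= 0:
        raise ValueError("frame_length must be positive")
    return [(start, min(start + frame_length, total_frames))
            for start in range(0, total_frames, frame_length)]


if __name__ == "__main__":
    for frame_length in (32, 16):
        chunks = chunk_ranges(70, frame_length)
        largest = max(end - start for start, end in chunks)
        print(f"frame_length={frame_length}: {len(chunks)} chunks, largest has {largest} frames")
```

Under this assumption, halving frame_length halves the largest chunk at the cost of more chunks to process.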

2. CogVideoX-based

Refer to these instructions for inference with the CogVideoX-5B-based model.

Please note that the CogVideoX-5B-based model supports only 720x480 input.
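
If a video does not already match the required size, one option is to rescale it with ffmpeg beforehand. The sketch below only builds the command; it uses ffmpeg's `scale` filter and the `-y` overwrite flag. The helper `resize_cmd` and the file names are hypothetical, and a plain rescale will distort videos whose aspect ratio is not 3:2.

```python
import subprocess


def resize_cmd(src, dst, width=720, height=480):
    """Build an ffmpeg command that rescales `src` to width x height and writes `dst`."""
    return ["ffmpeg", "-y", "-i", str(src),
            "-vf", f"scale={width}:{height}", str(dst)]


def resize(src, dst, width=720, height=480):
    """Run the rescale command; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(resize_cmd(src, dst, width, height), check=True)
```

For example, `resize("input/video/clip.mp4", "input/video/clip_720x480.mp4")` would write a 720x480 copy next to the original, assuming ffmpeg is installed.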
