Add pipeline tag and library name
Browse filesThis PR improves the model card by adding the pipeline tag and library name, ensuring the model can be found in the Hugging Face model hub.
README.md
CHANGED
@@ -2,7 +2,10 @@
|
|
2 |
license: other
|
3 |
license_name: skywork-license
|
4 |
license_link: LICENSE
|
|
|
|
|
5 |
---
|
|
|
6 |
<p align="center">
|
7 |
<img src="assets/logo2.png" alt="SkyReels Logo" width="50%">
|
8 |
</p>
|
@@ -273,7 +276,7 @@ torchrun --nproc_per_node=2 generate_video_df.py \
|
|
273 |
--base_num_frames 97 \
|
274 |
--num_frames 257 \
|
275 |
--overlap_history 17 \
|
276 |
-
--prompt "A
|
277 |
--use_usp \
|
278 |
--offload \
|
279 |
--seed 42
|
@@ -602,85 +605,4 @@ The evaluation demonstrates that our model achieves significant advancements in
|
|
602 |
<td>3.21</td>
|
603 |
<td>3.18</td>
|
604 |
<td>2.93</td>
|
605 |
-
</tr>
|
606 |
-
<tr>
|
607 |
-
<td>SkyReels-V2-I2V</td>
|
608 |
-
<td>3.29</td>
|
609 |
-
<td>3.42</td>
|
610 |
-
<td>3.18</td>
|
611 |
-
<td>3.56</td>
|
612 |
-
<td>3.01</td>
|
613 |
-
</tr>
|
614 |
-
</tbody>
|
615 |
-
</table>
|
616 |
-
</p>
|
617 |
-
|
618 |
-
Our results demonstrate that both **SkyReels-V2-I2V (3.29)** and **SkyReels-V2-DF (3.24)** achieve state-of-the-art performance among open-source models, significantly outperforming HunyuanVideo-13B (2.84) and Wan2.1-14B (2.85) across all quality dimensions. With an average score of 3.29, SkyReels-V2-I2V demonstrates comparable performance to proprietary models Kling-1.6 (3.4) and Runway-Gen4 (3.39).
|
619 |
-
|
620 |
-
|
621 |
-
#### VBench
|
622 |
-
To objectively compare SkyReels-V2 Model against other leading open-source Text-To-Video models, we conduct comprehensive evaluations using the public benchmark <a href="https://github.com/Vchitect/VBench">V-Bench</a>. Our evaluation specifically leverages the benchmark’s longer version prompt. For fair comparison with baseline models, we strictly follow their recommended setting for inference.
|
623 |
-
|
624 |
-
<p align="center">
|
625 |
-
<table align="center">
|
626 |
-
<thead>
|
627 |
-
<tr>
|
628 |
-
<th>Model</th>
|
629 |
-
<th>Total Score</th>
|
630 |
-
<th>Quality Score</th>
|
631 |
-
<th>Semantic Score</th>
|
632 |
-
</tr>
|
633 |
-
</thead>
|
634 |
-
<tbody>
|
635 |
-
<tr>
|
636 |
-
<td><a href="https://github.com/hpcaitech/Open-Sora">OpenSora 2.0</a></td>
|
637 |
-
<td>81.5 %</td>
|
638 |
-
<td>82.1 %</td>
|
639 |
-
<td>78.2 %</td>
|
640 |
-
</tr>
|
641 |
-
<tr>
|
642 |
-
<td><a href="https://github.com/THUDM/CogVideo">CogVideoX1.5-5B</a></td>
|
643 |
-
<td>80.3 %</td>
|
644 |
-
<td>80.9 %</td>
|
645 |
-
<td>77.9 %</td>
|
646 |
-
</tr>
|
647 |
-
<tr>
|
648 |
-
<td><a href="https://github.com/Tencent/HunyuanVideo">HunyuanVideo-13B</a></td>
|
649 |
-
<td>82.7 %</td>
|
650 |
-
<td>84.4 %</td>
|
651 |
-
<td>76.2 %</td>
|
652 |
-
</tr>
|
653 |
-
<tr>
|
654 |
-
<td><a href="https://github.com/Wan-Video/Wan2.1">Wan2.1-14B</a></td>
|
655 |
-
<td>83.7 %</td>
|
656 |
-
<td>84.2 %</td>
|
657 |
-
<td><strong>81.4 %</strong></td>
|
658 |
-
</tr>
|
659 |
-
<tr>
|
660 |
-
<td>SkyReels-V2</td>
|
661 |
-
<td><strong>83.9 %</strong></td>
|
662 |
-
<td><strong>84.7 %</strong></td>
|
663 |
-
<td>80.8 %</td>
|
664 |
-
</tr>
|
665 |
-
</tbody>
|
666 |
-
</table>
|
667 |
-
</p>
|
668 |
-
|
669 |
-
The VBench results demonstrate that SkyReels-V2 outperforms all compared models including HunyuanVideo-13B and Wan2.1-14B, With the highest **total score (83.9%)** and **quality score (84.7%)**. In this evaluation, the semantic score is slightly lower than Wan2.1-14B, while we outperform Wan2.1-14B in human evaluations, with the primary gap attributed to V-Bench’s insufficient evaluation of shot-scenario semantic adherence.
|
670 |
-
|
671 |
-
## Acknowledgements
|
672 |
-
We would like to thank the contributors of <a href="https://github.com/Wan-Video/Wan2.1">Wan 2.1</a>, <a href="https://github.com/xdit-project/xDiT">XDit</a> and <a href="https://qwenlm.github.io/blog/qwen2.5/">Qwen 2.5</a> repositories, for their open research and contributions.
|
673 |
-
|
674 |
-
## Citation
|
675 |
-
|
676 |
-
```bibtex
|
677 |
-
@misc{chen2025skyreelsv2infinitelengthfilmgenerative,
|
678 |
-
title={SkyReels-V2: Infinite-length Film Generative Model},
|
679 |
-
author={Guibin Chen and Dixuan Lin and Jiangping Yang and Chunze Lin and Juncheng Zhu and Mingyuan Fan and Hao Zhang and Sheng Chen and Zheng Chen and Chengchen Ma and Weiming Xiong and Wei Wang and Nuo Pang and Kang Kang and Zhiheng Xu and Yuzhe Jin and Yupeng Liang and Yubing Song and Peng Zhao and Boyuan Xu and Di Qiu and Debang Li and Zhengcong Fei and Yang Li and Yahui Zhou},
|
680 |
-
year={2025},
|
681 |
-
eprint={2504.13074},
|
682 |
-
archivePrefix={arXiv},
|
683 |
-
primaryClass={cs.CV},
|
684 |
-
url={https://arxiv.org/abs/2504.13074},
|
685 |
-
}
|
686 |
-
```
|
|
|
2 |
license: other
|
3 |
license_name: skywork-license
|
4 |
license_link: LICENSE
|
5 |
+
pipeline_tag: text-to-video, image-to-video
|
6 |
+
library_name: diffusers
|
7 |
---
|
8 |
+
|
9 |
<p align="center">
|
10 |
<img src="assets/logo2.png" alt="SkyReels Logo" width="50%">
|
11 |
</p>
|
|
|
276 |
--base_num_frames 97 \
|
277 |
--num_frames 257 \
|
278 |
--overlap_history 17 \
|
279 |
+
--prompt "A graceful white swan with a curved neck and delicate feathers swimming in a serene lake at dawn, its reflection perfectly mirrored in the still water as mist rises from the surface, with the swan occasionally dipping its head into the water to feed." \
|
280 |
--use_usp \
|
281 |
--offload \
|
282 |
--seed 42
|
|
|
605 |
<td>3.21</td>
|
606 |
<td>3.18</td>
|
607 |
<td>2.93</td>
|
608 |
+
</tr>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|