---
license: openrail++
datasets:
- krahets/dna_rendering_processed
base_model:
- stabilityai/stable-diffusion-2-1-base
pipeline_tag: video-to-video
tags:
- 3d-generation
- 4d-generation
- human
- avatar
- multi-view video
---

# Diffuman4D Model

[**Project Page**](https://diffuman4d.github.io/) | [**Paper**](https://arxiv.org/abs/2507.13344) | [**Code**](https://github.com/zju3dv/Diffuman4D) | [**Dataset**](https://huggingface.co/datasets/krahets/dna_rendering_processed)

> The official model repo for Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models.

<img src="assets/images/teaser_dna.gif" width="100%" alt="teaser">

Diffuman4D enables high-fidelity free-viewpoint rendering of human performances from sparse-view videos.

## Usage

See the [GitHub repo](https://github.com/zju3dv/Diffuman4D) for detailed usage instructions.

## Cite

```bibtex
@inproceedings{jin2025diffuman4d,
  title={Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models},
  author={Jin, Yudong and Peng, Sida and Wang, Xuan and Xie, Tao and Xu, Zhen and Yang, Yifan and Shen, Yujun and Bao, Hujun and Zhou, Xiaowei},
  booktitle={International Conference on Computer Vision (ICCV)},
  year={2025}
}
```