arxiv:2510.20822

HoloCine: Holistic Generation of Cinematic Multi-Shot Long Video Narratives

Published on Oct 23 · Submitted by taesiri on Oct 24
Authors:
Yue Yu, et al.
Abstract

HoloCine generates coherent multi-shot narratives using a Window Cross-Attention mechanism and Sparse Inter-Shot Self-Attention, enabling end-to-end cinematic creation.

AI-generated summary

State-of-the-art text-to-video models excel at generating isolated clips but fall short of creating the coherent, multi-shot narratives that are the essence of storytelling. We bridge this "narrative gap" with HoloCine, a model that generates entire scenes holistically to ensure global consistency from the first shot to the last. Our architecture achieves precise directorial control through a Window Cross-Attention mechanism that localizes text prompts to specific shots, while a Sparse Inter-Shot Self-Attention pattern (dense within shots but sparse between them) provides the efficiency required for minute-scale generation. Beyond setting a new state of the art in narrative coherence, HoloCine exhibits remarkable emergent abilities: a persistent memory for characters and scenes, and an intuitive grasp of cinematic techniques. Our work marks a pivotal shift from clip synthesis towards automated filmmaking, making end-to-end cinematic creation a tangible future. Our code is available at: https://holo-cine.github.io/.
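The "dense within shots, sparse between them" attention pattern can be illustrated with a small mask-building sketch. This is a hypothetical illustration of the general idea, not the paper's implementation: here each token attends densely to all tokens in its own shot, and sparsely to only the first `anchors_per_shot` tokens of every other shot (the function name, the anchor-token choice, and the parameter are assumptions for illustration).

```python
import torch

def sparse_inter_shot_mask(shot_lengths, anchors_per_shot=1):
    """Boolean attention mask: dense within each shot, sparse across shots.

    Hypothetical sketch of a sparse inter-shot pattern: queries in shot i
    attend to every token of shot i, but only to the first
    `anchors_per_shot` tokens of each other shot.
    """
    total = sum(shot_lengths)
    mask = torch.zeros(total, total, dtype=torch.bool)

    # Compute the start offset of each shot in the flattened sequence.
    starts, offset = [], 0
    for n in shot_lengths:
        starts.append(offset)
        offset += n

    for i, (si, ni) in enumerate(zip(starts, shot_lengths)):
        # Dense self-attention inside shot i.
        mask[si:si + ni, si:si + ni] = True
        # Sparse links to a few anchor tokens of every other shot.
        for j, (sj, nj) in enumerate(zip(starts, shot_lengths)):
            if i != j:
                k = min(anchors_per_shot, nj)
                mask[si:si + ni, sj:sj + k] = True
    return mask

# Example: two shots of 3 and 2 tokens, one anchor token per shot.
mask = sparse_inter_shot_mask([3, 2], anchors_per_shot=1)
```

Such a mask could then be passed as `attn_mask` to an attention call (e.g. `torch.nn.functional.scaled_dot_product_attention`), keeping cost roughly linear in the number of shots rather than quadratic in total sequence length.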

Community

Paper submitter

HoloCine is a text-to-video framework that holistically generates coherent, cinematic multi-shot video narratives from a single prompt, combining Window Cross-Attention for per-shot control and Sparse Inter-Shot Self-Attention for efficient, consistent long-scene generation.

Paper author

Thanks a lot @taesiri for helping us submit our paper to the daily papers! 🙏
Could we please use the following video as the cover to better showcase our results?
🎥 https://holo-cine.github.io/holocine.mp4


Congrats on the amazing work!

Unfortunately, it seems the media tag cannot be updated after an initial submission has been made. I think I messed that up, super sorry about that.


