This repo contains model checkpoints for Vamba-Qwen2-VL-7B. Vamba is a hybrid Mamba-Transformer model that leverages cross-attention layers and Mamba-2 blocks for efficient hour-long video understanding.
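To give a rough sense of why this hybrid layout can help with long videos, the sketch below counts token interactions per layer for two setups. It is an illustration only and is not code from the Vamba repository. It assumes that text tokens self-attend and cross-attend to the video tokens, while video tokens go through a linear-time Mamba-2 style scan. The function name `interaction_counts` and the token counts are hypothetical, and constant factors are ignored.

```python
def interaction_counts(num_video_tokens: int, num_text_tokens: int) -> dict:
    """Count per-layer token interactions (assumed cost model, constants ignored)."""
    m, n = num_video_tokens, num_text_tokens

    # Baseline: full self-attention over the concatenated video + text sequence.
    full_self_attention = (m + n) ** 2

    # Assumed hybrid layout:
    text_self_attention = n * n   # text tokens attend to each other
    cross_attention = n * m       # text queries attend to video keys/values
    video_linear_scan = m         # one recurrent step per video token

    hybrid = text_self_attention + cross_attention + video_linear_scan
    return {"full_self_attention": full_self_attention, "hybrid": hybrid}


if __name__ == "__main__":
    # Hypothetical sizes: many video tokens, a short text prompt.
    counts = interaction_counts(num_video_tokens=100_000, num_text_tokens=100)
    ratio = counts["full_self_attention"] / counts["hybrid"]
    print(counts, f"ratio={ratio:.1f}")
```

Under this simplified model the baseline grows quadratically with the number of video tokens, while the hybrid count grows linearly with it as long as the text prompt stays short. Actual speed and memory depend on the implementation and hardware.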
Homepage | arXiv | GitHub | Model
If you find our paper useful, please cite it as:
@misc{ren2025vambaunderstandinghourlongvideos,
title={Vamba: Understanding Hour-Long Videos with Hybrid Mamba-Transformers},
author={Weiming Ren and Wentao Ma and Huan Yang and Cong Wei and Ge Zhang and Wenhu Chen},
year={2025},
eprint={2503.11579},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2503.11579},
}