Citation

@article{shen2024longvu,
    title={LongVU: Spatiotemporal Adaptive Compression for Long Video-Language Understanding},
    author={Shen, Xiaoqian and Xiong, Yunyang and Zhao, Changsheng and Wu, Lemeng and Chen, Jun and Zhu, Chenchen and Liu, Zechun and Xiao, Fanyi and Varadarajan, Balakrishnan and Bordes, Florian and Liu, Zhuang and Xu, Hu and Kim, Hyunwoo J. and Soran, Bilge and Krishnamoorthi, Raghuraman and Elhoseiny, Mohamed and Chandra, Vikas},
    journal={arXiv preprint arXiv:2410.17434},
    year={2024}
}