MiniGPT4-Video: Advancing Multimodal LLMs for Video Understanding with Interleaved Visual-Textual Tokens Paper • 2404.03413 • Published Apr 4, 2024 • 28
RepVideo: Rethinking Cross-Layer Representation for Video Generation Paper • 2501.08994 • Published Jan 15 • 15