TUNA: Comprehensive Fine-grained Temporal Understanding Evaluation on Dense Dynamic Videos Paper • 2505.20124 • Published May 26
STICKERCONV: Generating Multimodal Empathetic Responses from Scratch Paper • 2402.01679 • Published Jan 20, 2024
Modality Curation: Building Universal Embeddings for Advanced Multimodal Information Retrieval Paper • 2505.19650 • Published May 26