Speaking Beyond Language: A Large-Scale Multimodal Dataset for Learning Nonverbal Cues from Video-Grounded Dialogues Paper • 2506.00958 • Published 11 days ago • 20
Revisiting Residual Connections: Orthogonal Updates for Stable and Efficient Deep Networks Paper • 2505.11881 • Published 26 days ago • 4
VisEscape: A Benchmark for Evaluating Exploration-driven Decision-making in Virtual Escape Rooms Paper • 2503.14427 • Published Mar 18 • 19
How Far is Video Generation from World Model: A Physical Law Perspective Paper • 2411.02385 • Published Nov 4, 2024 • 36
Unpacking SDXL Turbo: Interpreting Text-to-Image Models with Sparse Autoencoders Paper • 2410.22366 • Published Oct 28, 2024 • 83
Running on CPU Upgrade 13.2k 13.2k Open LLM Leaderboard 🏆 Track, rank and evaluate open LLMs and chatbots