Feature-Level Insights into Artificial Text Detection with Sparse Autoencoders Paper • 2503.03601 • Published Mar 5 • 234
Token-Efficient Long Video Understanding for Multimodal LLMs Paper • 2503.04130 • Published Mar 6 • 95
view article Article State of open video generation models in Diffusers By sayakpaul and 2 others • Jan 27 • 53
view article Article SigLIP 2: A better multilingual vision language encoder By ariG23498 and 2 others • Feb 21 • 165
π_0: A Vision-Language-Action Flow Model for General Robot Control Paper • 2410.24164 • Published Oct 31, 2024 • 17
CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages Paper • 2309.09400 • Published Sep 17, 2023 • 85