Token-Efficient Long Video Understanding for Multimodal LLMs Paper • 2503.04130 • Published 3 days ago • 61
DICEPTION: A Generalist Diffusion Model for Visual Perceptual Tasks Paper • 2502.17157 • Published 13 days ago • 51
The Lottery LLM Hypothesis, Rethinking What Abilities Should LLM Compression Preserve? Paper • 2502.17535 • Published 13 days ago • 8
Slamming: Training a Speech Language Model on One GPU in a Day Paper • 2502.15814 • Published 18 days ago • 66
Make LoRA Great Again: Boosting LoRA with Adaptive Singular Values and Mixture-of-Experts Optimization Alignment Paper • 2502.16894 • Published 13 days ago • 26