Identifying Sensitive Weights via Post-quantization Integral Paper • 2503.01901 • Published 10 days ago • 7
Audio Flamingo 2: An Audio-Language Model with Long-Audio Understanding and Expert Reasoning Abilities Paper • 2503.03983 • Published 4 days ago • 18
LINGOLY-TOO: Disentangling Memorisation from Reasoning with Linguistic Templatisation and Orthographic Obfuscation Paper • 2503.02972 • Published 5 days ago • 23
Token-Efficient Long Video Understanding for Multimodal LLMs Paper • 2503.04130 • Published 4 days ago • 64
LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM Paper • 2503.04724 • Published 3 days ago • 47
Remasking Discrete Diffusion Models with Inference-Time Scaling Paper • 2503.00307 • Published 9 days ago • 8
TheoremExplainAgent: Towards Multimodal Explanations for LLM Theorem Understanding Paper • 2502.19400 • Published 11 days ago • 42
MedVLM-R1: Incentivizing Medical Reasoning Capability of Vision-Language Models (VLMs) via Reinforcement Learning Paper • 2502.19634 • Published 11 days ago • 56
Multimodal Representation Alignment for Image Generation: Text-Image Interleaved Control Is Easier Than You Think Paper • 2502.20172 • Published 10 days ago • 26
UniTok: A Unified Tokenizer for Visual Generation and Understanding Paper • 2502.20321 • Published 10 days ago • 27
Mobius: Text to Seamless Looping Video Generation via Latent Shift Paper • 2502.20307 • Published 10 days ago • 16
FlexiDiT: Your Diffusion Transformer Can Easily Generate High-Quality Samples with Less Compute Paper • 2502.20126 • Published 10 days ago • 19
OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference Paper • 2502.18411 • Published 12 days ago • 69
Project Alexandria: Towards Freeing Scientific Knowledge from Copyright Burdens via LLMs Paper • 2502.19413 • Published 11 days ago • 19
How far can we go with ImageNet for Text-to-Image generation? Paper • 2502.21318 • Published 9 days ago • 25
SoS1: O1 and R1-Like Reasoning LLMs are Sum-of-Square Solvers Paper • 2502.20545 • Published 10 days ago • 20