Submitted by fuvty 45 Cache-to-Cache: Direct Semantic Communication Between Large Language Models Tsinghua-NICS-EFC 17 2
Submitted by forde450 32 Ming-UniVision: Joint Image Understanding and Generation with a Unified Continuous Tokenizer inclusionAI 49 1
Submitted by taesiri 31 Lumina-DiMOO: An Omni Diffusion Large Language Model for Multi-Modal Generation and Understanding Alpha-VLLM 720 1
Submitted by dcml0714 28 SHANKS: Simultaneous Hearing and Thinking for Spoken Language Models · 10 authors 1
Submitted by zoeyuchao 27 RLinf-VLA: A Unified and Efficient Framework for VLA+RL Training RLinf 468 1
Submitted by taesiri 23 MATRIX: Mask Track Alignment for Interaction-aware Video Generation · 8 authors 19 1
Submitted by FSCCS 12 OBS-Diff: Accurate Pruning For Diffusion Models in One-Shot Westlake University 22 1
Submitted by ZetangForward 12 Revisiting Long-context Modeling from Context Denoising Perspective Soochow University 2
Submitted by whyu 10 Artificial Hippocampus Networks for Efficient Long-Context Modeling ByteDance Seed 13 1
Submitted by huggingaaaaa 10 Why Low-Precision Transformer Training Fails: An Analysis on Flash Attention Tsinghua University 1 1
Submitted by XinXuNLPer 9 When Benchmarks Age: Temporal Misalignment through Large Language Model Factuality Evaluation McAuley-Lab 3 1
Submitted by MingyuLiu 9 StaMo: Unsupervised Learning of Generalizable Robot Motion from Compact State Representation Zhejiang University 2
Submitted by Chenfei-Liao 8 Are We Using the Right Benchmark: An Evaluation Framework for Visual Token Compression Methods · 13 authors 1
Submitted by JimmyMa99 7 Patch-as-Decodable-Token: Towards Unified Multi-Modal Vision Tasks in MLLMs · 14 authors 12 1
Submitted by XuWuLingYu 5 WristWorld: Generating Wrist-Views via 4D World Models for Robotic Manipulation Peking University 1
Submitted by taesiri 4 TTRV: Test-Time Reinforcement Learning for Vision Language Models · 10 authors 4 1
Submitted by taesiri 3 AlphaApollo: Orchestrating Foundation Models and Professional Tools into a Self-Evolving System for Deep Agentic Reasoning · 17 authors 5 1
Submitted by myownskyW7 3 G^2RPO: Granular GRPO for Precise Reward in Flow Models IXCLab@Shanghai AI Lab 15 1
Submitted by RajveeSheth 2 Beyond Monolingual Assumptions: A Survey of Code-Switched NLP in the Era of Large Language Models Lingo Research Group 1 1
Submitted by imsheriff 2 The African Languages Lab: A Collaborative Approach to Advancing Low-Resource African NLP University of California, Los Angeles 1
Submitted by taesiri 1 U-Bench: A Comprehensive Understanding of U-Net through 100-Variant Benchmarking · 10 authors 22 1
Submitted by Yanran21 1 D^3QE: Learning Discrete Distribution Discrepancy-aware Quantization Error for Autoregressive-Generated Image Detection · 8 authors 1 1
Submitted by yasNing 1 DeepTravel: An End-to-End Agentic Reinforcement Learning Framework for Autonomous Travel Planning Agents Didi Chuxing 1