Submitted by xianbao 67 GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models · 171 authors 1
Submitted by RyanL22 33 Voost: A Unified and Scalable Diffusion Transformer for Bidirectional Virtual Try-On and Try-Off · 2 authors 51 2
Submitted by SiriusL 16 InfiGUI-G1: Advancing GUI Grounding with Adaptive Exploration Policy Optimization · 13 authors 7 1
Submitted by YerbaPage 12 Pruning the Unsurprising: Efficient Code Reasoning via First-Token Surprisal · 7 authors 3 2
Submitted by MikolajZ 6 GENIE: Gaussian Encoding for Neural Radiance Fields Interactive Editing · 4 authors 8 1
Submitted by hdong51 5 Adapting Vision-Language Models Without Labels: A Comprehensive Survey · 6 authors 16 1
Submitted by KejiaRobust 4 MELLA: Bridging Linguistic Capability and Cultural Groundedness for Low-Resource Language MLLMs · 7 authors 1
Submitted by fsk515 3 MeshLLM: Empowering Large Language Models to Progressively Understand and Generate 3D Mesh · 9 authors 1
Submitted by LianShuQuan 2 UI-AGILE: Advancing GUI Agents with Effective Reinforcement Learning and Precise Inference-Time Grounding · 7 authors 1
Submitted by thebluser 1 LightSwitch: Multi-view Relighting with Material-guided Diffusion · 3 authors 2
Submitted by huxueyu 1 OS Agents: A Survey on MLLM-based Agents for General Computing Devices Use · 29 authors 325 1