Submitted by henggg 101 The Landscape of Agentic Reinforcement Learning for LLMs: A Survey · 25 authors 155 2
Submitted by lovesnowbest 92 UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning · 106 authors 3
Submitted by SivilTaram 74 SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning · 7 authors 231 2
Submitted by taesiri 69 LLaVA-Critic-R1: Your Critic Model is Secretly a Strong Policy Model · 7 authors 4.18k 1
Submitted by HLSv 52 ELV-Halluc: Benchmarking Semantic Aggregation Hallucinations in Long Video Understanding · 8 authors 7 1
Submitted by DongfuJiang 51 VerlTool: Towards Holistic Agentic Reinforcement Learning with Tool Use · 12 authors 426 3
Submitted by YuanLiuuuuuu 42 POINTS-Reader: Distillation-Free Adaptation of Vision-Language Models for Document Conversion · 11 authors 3
Submitted by rishiraj 32 Gated Associative Memory: A Parallel O(N) Architecture for Efficient Sequence Modeling · 1 author 7 4
Submitted by fairyang 31 Baichuan-M2: Scaling Medical Capability with Large Verifier System · 34 authors 2
Submitted by hammh0a 27 Reasoning Vectors: Transferring Chain-of-Thought Capabilities via Task Arithmetic · 3 authors 1
Submitted by dogtooth 21 Jointly Reinforcing Diversity and Quality in Language Model Generations · 8 authors 1
Submitted by Yanqing0327 21 OpenVision 2: A Family of Generative Pretrained Visual Encoders for Multimodal Learning · 7 authors 328 1
Submitted by Geaming 20 Implicit Actor Critic Coupling via a Supervised Learning Framework for RLVR · 8 authors 20 2
Submitted by Xiaoyu521 19 GenCompositor: Generative Video Compositing with Diffusion Transformer · 7 authors 52 4
Submitted by Andron00e 18 Benchmarking Optimizers for Large Language Model Pretraining · 3 authors 12 1
Submitted by nsjain 17 DynaGuard: A Dynamic Guardrail Model With User-Defined Policies · 10 authors 7 2
Submitted by ahnpersie 17 FlashAdventure: A Benchmark for GUI Agents Solving Full Story Arcs in Diverse Adventure Games · 7 authors 8 1
Submitted by kwangju 13 Attributes as Textual Genes: Leveraging LLMs as Genetic Algorithm Simulators for Conditional Synthetic Data Generation · 3 authors 1
Submitted by orionweller 13 On the Theoretical Limitations of Embedding-Based Retrieval · 4 authors 1
Submitted by che111 11 M3Ret: Unleashing Zero-shot Multimodal Medical Image Retrieval via Self-Supervision · 8 authors 1
Submitted by yulongchen 10 The Gold Medals in an Empty Room: Diagnosing Metalinguistic Reasoning in LLMs with Camlang · 6 authors 1
Submitted by zhangganlin 5 ViSTA-SLAM: Visual SLAM with Symmetric Two-view Association · 4 authors 58 1
Submitted by fengerhu 5 MobiAgent: A Systematic Framework for Customizable Mobile Agents · 10 authors 50 1
Submitted by quandao10 4 Discrete Noise Inversion for Next-scale Autoregressive Text-based Image Editing · 9 authors 1
Submitted by amanchadha 3 SQL-of-Thought: Multi-agentic Text-to-SQL with Guided Error Correction · 3 authors 1
Submitted by xianbao 3 Metis: Training Large Language Models with Advanced Low-Bit Quantization · 16 authors 1
Submitted by evanking 2 Flavors of Moonshine: Tiny Specialized ASR Models for Edge Devices · 5 authors 2.85k 1
Submitted by taesiri 2 MedDINOv3: How to adapt vision foundation models for medical image segmentation? · 5 authors 19 1
Submitted by amanchadha 2 AMBEDKAR-A Multi-level Bias Elimination through a Decoding Approach with Knowledge Augmentation for Robust Constitutional Alignment of Language Models · 8 authors 1
Submitted by kenantang 2 Flaw or Artifact? Rethinking Prompt Sensitivity in Evaluating LLMs · 6 authors 1
Submitted by taesiri 2 Improving Large Vision and Language Models by Learning from a Panel of Peers · 5 authors 1
Submitted by theresiavr 2 Stairway to Fairness: Connecting Group and Individual Fairness · 5 authors 1 1
Submitted by zhengchong 2 FastFit: Accelerating Multi-Reference Virtual Try-On via Cacheable Diffusion Models · 10 authors 27 1
Submitted by aHapBean 1 Towards More Diverse and Challenging Pre-training for Point Cloud Learning: Self-Supervised Cross Reconstruction with Decoupled Views · 3 authors 4 2
Submitted by Bekhouche 1 C-DiffDet+: Fusing Global Scene Context with Generative Denoising for High-Fidelity Object Detection · 6 authors 1