Submitted by henggg 213 The Landscape of Agentic Reinforcement Learning for LLMs: A Survey · 25 authors 883 3
Submitted by lovesnowbest 121 UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning · 106 authors 4
Submitted by SivilTaram 83 SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning · 7 authors 295 2
Submitted by taesiri 83 LLaVA-Critic-R1: Your Critic Model is Secretly a Strong Policy Model · 7 authors 4.29k 1
Submitted by DongfuJiang 71 VerlTool: Towards Holistic Agentic Reinforcement Learning with Tool Use TIGER-Lab 581 4
Submitted by hammh0a 57 Reasoning Vectors: Transferring Chain-of-Thought Capabilities via Task Arithmetic · 3 authors 1
Submitted by HLSv 54 ELV-Halluc: Benchmarking Semantic Aggregation Hallucinations in Long Video Understanding · 8 authors 8 1
Submitted by YuanLiuuuuuu 50 POINTS-Reader: Distillation-Free Adaptation of Vision-Language Models for Document Conversion · 11 authors 4
Submitted by rishiraj 42 Gated Associative Memory: A Parallel O(N) Architecture for Efficient Sequence Modeling · 1 author 16 5
Submitted by fairyang 39 Baichuan-M2: Scaling Medical Capability with Large Verifier System · 34 authors 2
Submitted by Yanqing0327 33 OpenVision 2: A Family of Generative Pretrained Visual Encoders for Multimodal Learning · 7 authors 395 2
Submitted by dogtooth 25 Jointly Reinforcing Diversity and Quality in Language Model Generations · 8 authors 1
Submitted by Geaming 25 Implicit Actor Critic Coupling via a Supervised Learning Framework for RLVR · 8 authors 29 4
Submitted by Xiaoyu521 25 GenCompositor: Generative Video Compositing with Diffusion Transformer · 7 authors 120 4
Submitted by Andron00e 24 Benchmarking Optimizers for Large Language Model Pretraining · 3 authors 35 1
Submitted by nsjain 20 DynaGuard: A Dynamic Guardrail Model With User-Defined Policies · 10 authors 15 2
Submitted by ahnpersie 20 FlashAdventure: A Benchmark for GUI Agents Solving Full Story Arcs in Diverse Adventure Games · 7 authors 15 1
Submitted by orionweller 19 On the Theoretical Limitations of Embedding-Based Retrieval · 4 authors 1
Submitted by kwangju 14 Attributes as Textual Genes: Leveraging LLMs as Genetic Algorithm Simulators for Conditional Synthetic Data Generation · 3 authors 1
Submitted by che111 11 M3Ret: Unleashing Zero-shot Multimodal Medical Image Retrieval via Self-Supervision · 8 authors 1
Submitted by yulongchen 11 The Gold Medals in an Empty Room: Diagnosing Metalinguistic Reasoning in LLMs with Camlang · 6 authors 1
Submitted by amanchadha 9 SQL-of-Thought: Multi-agentic Text-to-SQL with Guided Error Correction · 3 authors 1
Submitted by zhangganlin 7 ViSTA-SLAM: Visual SLAM with Symmetric Two-view Association · 4 authors 142 1
Submitted by quandao10 6 Discrete Noise Inversion for Next-scale Autoregressive Text-based Image Editing · 9 authors 1
Submitted by fengerhu 6 MobiAgent: A Systematic Framework for Customizable Mobile Agents · 10 authors 286 2
Submitted by evanking 5 Flavors of Moonshine: Tiny Specialized ASR Models for Edge Devices · 5 authors 2.9k 1
Submitted by xianbao 5 Metis: Training Large Language Models with Advanced Low-Bit Quantization · 16 authors 1
Submitted by amanchadha 4 AMBEDKAR-A Multi-level Bias Elimination through a Decoding Approach with Knowledge Augmentation for Robust Constitutional Alignment of Language Models · 8 authors 1
Submitted by kenantang 4 Flaw or Artifact? Rethinking Prompt Sensitivity in Evaluating LLMs · 6 authors 1
Submitted by taesiri 2 MedDINOv3: How to adapt vision foundation models for medical image segmentation? · 5 authors 92 1
Submitted by taesiri 2 Improving Large Vision and Language Models by Learning from a Panel of Peers · 5 authors 1
Submitted by aHapBean 2 Towards More Diverse and Challenging Pre-training for Point Cloud Learning: Self-Supervised Cross Reconstruction with Decoupled Views · 3 authors 14 2
Submitted by theresiavr 2 Stairway to Fairness: Connecting Group and Individual Fairness · 5 authors 1 1
Submitted by zhengchong 2 FastFit: Accelerating Multi-Reference Virtual Try-On via Cacheable Diffusion Models · 10 authors 44 1
Submitted by Bekhouche 1 C-DiffDet+: Fusing Global Scene Context with Generative Denoising for High-Fidelity Object Detection · 6 authors 1