-
Llama-3.1-FoundationAI-SecurityLLM-8B-Instruct Technical Report
Paper • 2508.01059 • Published • 33 -
Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens
Paper • 2508.01191 • Published • 225 -
On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification
Paper • 2508.05629 • Published • 166 -
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models
Paper • 2508.06471 • Published • 162
Collections
Discover the best community collections!
Collections including paper arxiv:2508.10833
-
AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning
Paper • 2402.15506 • Published • 17 -
AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent
Paper • 2404.03648 • Published • 29 -
Similarity is Not All You Need: Endowing Retrieval Augmented Generation with Multi Layered Thoughts
Paper • 2405.19893 • Published • 32 -
Parrot: Efficient Serving of LLM-based Applications with Semantic Variable
Paper • 2405.19888 • Published • 7
-
MobileGUI-RL: Advancing Mobile GUI Agent through Reinforcement Learning in Online Environment
Paper • 2507.05720 • Published • 2 -
GUI-G^2: Gaussian Reward Modeling for GUI Grounding
Paper • 2507.15846 • Published • 131 -
VeriGUI: Verifiable Long-Chain GUI Dataset
Paper • 2508.04026 • Published • 142 -
UI-Venus Technical Report: Building High-performance UI Agents with RFT
Paper • 2508.10833 • Published • 38
-
lusxvr/nanoVLM-222M
Image-Text-to-Text • 0.2B • Updated • 432 • 95 -
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
Paper • 2503.09516 • Published • 35 -
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time
Paper • 2505.24863 • Published • 97 -
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning
Paper • 2505.17667 • Published • 89
-
ReLearn: Unlearning via Learning for Large Language Models
Paper • 2502.11190 • Published • 30 -
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention
Paper • 2502.11089 • Published • 165 -
Explorer: Scaling Exploration-driven Web Trajectory Synthesis for Multimodal Web Agents
Paper • 2502.11357 • Published • 10 -
DeepPerception: Advancing R1-like Cognitive Visual Perception in MLLMs for Knowledge-Intensive Visual Grounding
Paper • 2503.12797 • Published • 32
-
UI-Venus Technical Report: Building High-performance UI Agents with RFT
Paper • 2508.10833 • Published • 38 -
inclusionAI/UI-Venus-Ground-7B
Image-Text-to-Text • 8B • Updated • 1.62k • 14 -
inclusionAI/UI-Venus-Ground-72B
Image-Text-to-Text • 73B • Updated • 236 • 9 -
inclusionAI/UI-Venus-Navi-7B
Image-Text-to-Text • 8B • Updated • 244 • 7
-
Gemini Robotics: Bringing AI into the Physical World
Paper • 2503.20020 • Published • 28 -
Magma: A Foundation Model for Multimodal AI Agents
Paper • 2502.13130 • Published • 58 -
LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents
Paper • 2311.05437 • Published • 51 -
OS-ATLAS: A Foundation Action Model for Generalist GUI Agents
Paper • 2410.23218 • Published • 51
-
Llama-3.1-FoundationAI-SecurityLLM-8B-Instruct Technical Report
Paper • 2508.01059 • Published • 33 -
Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens
Paper • 2508.01191 • Published • 225 -
On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification
Paper • 2508.05629 • Published • 166 -
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models
Paper • 2508.06471 • Published • 162
-
UI-Venus Technical Report: Building High-performance UI Agents with RFT
Paper • 2508.10833 • Published • 38 -
inclusionAI/UI-Venus-Ground-7B
Image-Text-to-Text • 8B • Updated • 1.62k • 14 -
inclusionAI/UI-Venus-Ground-72B
Image-Text-to-Text • 73B • Updated • 236 • 9 -
inclusionAI/UI-Venus-Navi-7B
Image-Text-to-Text • 8B • Updated • 244 • 7
-
AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning
Paper • 2402.15506 • Published • 17 -
AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent
Paper • 2404.03648 • Published • 29 -
Similarity is Not All You Need: Endowing Retrieval Augmented Generation with Multi Layered Thoughts
Paper • 2405.19893 • Published • 32 -
Parrot: Efficient Serving of LLM-based Applications with Semantic Variable
Paper • 2405.19888 • Published • 7
-
MobileGUI-RL: Advancing Mobile GUI Agent through Reinforcement Learning in Online Environment
Paper • 2507.05720 • Published • 2 -
GUI-G^2: Gaussian Reward Modeling for GUI Grounding
Paper • 2507.15846 • Published • 131 -
VeriGUI: Verifiable Long-Chain GUI Dataset
Paper • 2508.04026 • Published • 142 -
UI-Venus Technical Report: Building High-performance UI Agents with RFT
Paper • 2508.10833 • Published • 38
-
lusxvr/nanoVLM-222M
Image-Text-to-Text • 0.2B • Updated • 432 • 95 -
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
Paper • 2503.09516 • Published • 35 -
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time
Paper • 2505.24863 • Published • 97 -
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning
Paper • 2505.17667 • Published • 89
-
Gemini Robotics: Bringing AI into the Physical World
Paper • 2503.20020 • Published • 28 -
Magma: A Foundation Model for Multimodal AI Agents
Paper • 2502.13130 • Published • 58 -
LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents
Paper • 2311.05437 • Published • 51 -
OS-ATLAS: A Foundation Action Model for Generalist GUI Agents
Paper • 2410.23218 • Published • 51
-
ReLearn: Unlearning via Learning for Large Language Models
Paper • 2502.11190 • Published • 30 -
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention
Paper • 2502.11089 • Published • 165 -
Explorer: Scaling Exploration-driven Web Trajectory Synthesis for Multimodal Web Agents
Paper • 2502.11357 • Published • 10 -
DeepPerception: Advancing R1-like Cognitive Visual Perception in MLLMs for Knowledge-Intensive Visual Grounding
Paper • 2503.12797 • Published • 32