Collections

Collections including paper arxiv:2505.17612

about 22 hours ago

QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning

Paper • 2505.17667 • Published 14 days ago • 84
Distilling LLM Agent into Small Models with Retrieval and Code Tools

Paper • 2505.17612 • Published 14 days ago • 76
Qwen3 Technical Report

Paper • 2505.09388 • Published 23 days ago • 181
Absolute Zero: Reinforced Self-play Reasoning with Zero Data

Paper • 2505.03335 • Published about 1 month ago • 168

Distilling LLM Agent into Small Models with Retrieval and Code Tools

Paper • 2505.17612 • Published 14 days ago • 76

LLMs Distillation

Distilling LLM Agent into Small Models with Retrieval and Code Tools

Paper • 2505.17612 • Published 14 days ago • 76

Smol Agents papers

Distilling LLM Agent into Small Models with Retrieval and Code Tools

Paper • 2505.17612 • Published 14 days ago • 76

Distilling LLM Agent into Small Models with Retrieval and Code Tools

Paper • 2505.17612 • Published 14 days ago • 76
Synthetic Data RL: Task Definition Is All You Need

Paper • 2505.17063 • Published 19 days ago • 10

Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective

Paper • 2505.15045 • Published 16 days ago • 53
MMaDA: Multimodal Large Diffusion Language Models

Paper • 2505.15809 • Published 15 days ago • 85
R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing

Paper • 2505.21600 • Published 9 days ago • 68
Unsupervised Post-Training for Multi-Modal LLM Reasoning via GRPO

Paper • 2505.22453 • Published 8 days ago • 45

J1: Incentivizing Thinking in LLM-as-a-Judge via Reinforcement Learning

Paper • 2505.10320 • Published 22 days ago • 22
Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures

Paper • 2505.09343 • Published 23 days ago • 63
Beyond 'Aha!': Toward Systematic Meta-Abilities Alignment in Large Reasoning Models

Paper • 2505.10554 • Published 21 days ago • 118
Scaling Reasoning can Improve Factuality in Large Language Models

Paper • 2505.11140 • Published 21 days ago • 6

Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities

Paper • 2505.02567 • Published May 5 • 74
TabSTAR: A Foundation Tabular Model With Semantically Target-Aware Representations

Paper • 2505.18125 • Published 13 days ago • 109
Distilling LLM Agent into Small Models with Retrieval and Code Tools

Paper • 2505.17612 • Published 14 days ago • 76
One RL to See Them All: Visual Triple Unified Reinforcement Learning

Paper • 2505.18129 • Published 13 days ago • 59

about 5 hours ago

CoRAG: Collaborative Retrieval-Augmented Generation

Paper • 2504.01883 • Published Apr 2 • 10
VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning

Paper • 2504.08837 • Published Apr 10 • 43
Mavors: Multi-granularity Video Representation for Multimodal Large Language Model

Paper • 2504.10068 • Published Apr 14 • 30
xVerify: Efficient Answer Verifier for Reasoning Model Evaluations

Paper • 2504.10481 • Published Apr 14 • 84

Towards General-Purpose Model-Free Reinforcement Learning

Paper • 2501.16142 • Published Jan 27 • 30
DAPO: An Open-Source LLM Reinforcement Learning System at Scale

Paper • 2503.14476 • Published Mar 18 • 128
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?

Paper • 2504.13837 • Published Apr 18 • 127
Learning to Reason under Off-Policy Guidance

Paper • 2504.14945 • Published Apr 21 • 85
