dongguanting's Collections

ARPO

updated 23 days ago

The official datasets and model checkpoints of ARPO

  • Agentic Reinforced Policy Optimization

    Paper • 2507.19849 • Published 26 days ago • 140

  • dongguanting/Qwen3-8B-ARPO-DeepSearch

    8B • Updated 24 days ago • 30 • 1

  • dongguanting/Qwen3-14B-ARPO-DeepSearch

    Text Generation • 15B • Updated 9 days ago • 60 • 4

  • dongguanting/Qwen2.5-7B-ARPO

    Text Generation • 8B • Updated 2 days ago • 67 • 2

  • dongguanting/Llama3.1-8B-ARPO

    Text Generation • 8B • Updated 9 days ago • 16 • 1

  • dongguanting/Qwen2.5-3B-ARPO

    Text Generation • 3B • Updated 9 days ago • 42 • 1

  • dongguanting/ARPO-SFT-54K

    Viewer • Updated 9 days ago • 54.6k • 692 • 8

  • dongguanting/ARPO-RL-Reasoning-10K

    Viewer • Updated 9 days ago • 10k • 320 • 2

  • dongguanting/ARPO-RL-DeepSearch-1K

    Viewer • Updated 23 days ago • 1.07k • 273 • 3