Hugging Face's logo Hugging Face
  • Models
  • Datasets
  • Spaces
  • Docs
  • Enterprise
  • Pricing

  • Log In
  • Sign Up
PeterJinGo 's Collections
Search-R1-v0.3
Search-R1-v0.2
Search-R1

Search-R1-v0.3

updated 20 days ago

RL with outcome reward + format reward. https://arxiv.org/abs/2505.15117

Upvote
2

  • PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-3b-em-ppo-v0.3

    Updated 22 days ago • 134

  • PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-3b-it-em-ppo-v0.3

    Updated 22 days ago • 13

  • PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-3b-em-grpo-v0.3

    Updated 22 days ago • 85

  • PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-3b-it-em-grpo-v0.3

    Updated 22 days ago • 13

  • PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-7b-em-ppo-v0.3

    Updated 22 days ago • 62

  • PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-7b-em-grpo-v0.3

    Updated 22 days ago • 10

  • PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-7b-it-em-grpo-v0.3

    Updated 22 days ago • 51 • 1

  • PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-14b-em-ppo-v0.3

    Updated May 2 • 9

  • PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-14b-em-grpo-v0.3

    Updated May 2 • 19

  • PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-14b-it-em-grpo-v0.3

    Updated May 2 • 9

  • PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-32b-em-grpo-v0.3

    Updated May 10 • 1.09k
Upvote
2
  • Collection guide
  • Browse collections
Company
TOS Privacy About Jobs
Website
Models Datasets Spaces Pricing Docs