Jian Hu

chuyi777

https://hujian.website

hijkzzz

AI & ML interests

Reinforcement Learning

chuyi777's activity

upvoted a paper 4 days ago

A Minimalist Approach to LLM Reasoning: from Rejection Sampling to Reinforce

Paper • 2504.11343 • Published 5 days ago • 11

liked 2 datasets about 2 months ago

open-r1/OpenR1-Math-220k

Viewer • Updated Feb 18 • 450k • 37.6k • 555

open-thoughts/OpenThoughts-114k

Viewer • Updated 14 days ago • 228k • 30.3k • 689

liked a model about 2 months ago

qihoo360/TinyR1-32B-Preview

Text Generation • Updated 4 days ago • 3.02k • 327

upvoted a paper about 2 months ago

Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning

Paper • 2502.14768 • Published Feb 20 • 48

upvoted a paper 3 months ago

s1: Simple test-time scaling

Paper • 2501.19393 • Published Jan 31 • 118

liked a model 3 months ago

CohereLabs/c4ai-command-r7b-12-2024

Text Generation • Updated 5 days ago • 7.26k • 381

upvoted a paper 3 months ago

REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models

Paper • 2501.03262 • Published Jan 4 • 99

commented a paper 3 months ago

REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models

Paper • 2501.03262 • Published Jan 4 • 99

liked 2 datasets 4 months ago

AI-MO/NuminaMath-CoT

Viewer • Updated Nov 25, 2024 • 860k • 3.77k • 442

yingyingzhang/metamath-qwen2-math

Viewer • Updated Oct 1, 2024 • 467k • 171 • 32

upvoted a paper 4 months ago

ProcessBench: Identifying Process Errors in Mathematical Reasoning

Paper • 2412.06559 • Published Dec 9, 2024 • 83

updated 3 models 5 months ago

liked a model 6 months ago

O1-OPEN/OpenO1-LLama-8B-v0.1

Updated Oct 8, 2024 • 67 • 17

updated a model 6 months ago

OpenRLHF/Mistral-7b-PRM-Math-Shepherd

Updated Oct 30, 2024 • 10 • 1

New activity in OpenRLHF/Mistral-7b-PRM-Math-Shepherd 6 months ago

How do I download the model?

#1 opened 6 months ago by Yutong001

liked 2 models 6 months ago

AI-MO/NuminaMath-7B-TIR

Text Generation • Updated Aug 14, 2024 • 10.7k • 340

Nexusflow/Athene-70B

Text Generation • Updated Nov 15, 2024 • 3.12k • 197