Hugging Face
Wei Xiong
weqweasdas
17 followers · 19 following
https://weixiongust.github.io/WeiXiongUST/index.html
AI & ML interests
Machine learning, RLHF
Recent Activity
updated a dataset about 8 hours ago
weqweasdas/numina_prompt_non_dedu
published a dataset about 8 hours ago
weqweasdas/numina_prompt_non_dedu
upvoted a paper 10 days ago
ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models
Organizations
weqweasdas's activity
liked a model 5 months ago
RLHFlow/Llama3.1-8B-PRM-Deepseek-Data • Text Generation • Updated May 10 • 21.9k • 35
liked a dataset 7 months ago
RLHFlow/RLHFlow-SFT-Dataset-ver2 • Viewer • Updated Nov 2, 2024 • 2.32M • 73 • 5
liked a model 7 months ago
RLHFlow/Llama3.1-8B-PRM-Mistral-Data • Text Generation • Updated Nov 9, 2024 • 1.68k • 10
liked 2 models 10 months ago
NCSOFT/Llama-3-OffsetBias-RM-8B • Text Classification • Updated Sep 6, 2024 • 424 • 23
RLHFlow/LLaMA3-SFT • Text Generation • Updated Nov 3, 2024 • 6.55k • 10
liked 9 models about 1 year ago
RLHFlow/LLaMA3-iterative-DPO-final • Text Generation • Updated Oct 14, 2024 • 3.29k • 41
RLHFlow/ArmoRM-Llama3-8B-v0.1 • Text Classification • Updated Sep 23, 2024 • 19.8k • 178
RLHFlow/pair-preference-model-LLaMA3-8B • Text Generation • Updated Oct 14, 2024 • 1.52k • 38
Salesforce/LLaMA-3-8B-SFR-RM-R • Text Classification • Updated Jan 21 • 14 • 11
Salesforce/LLaMA-3-8B-SFR-SFT-R • Text Generation • Updated Jan 21 • 61 • 8
Salesforce/LLaMA-3-8B-SFR-Iterative-DPO-R • Text Generation • Updated Jan 21 • 1.46k • 77
sfairXC/FsfairX-LLaMA3-RM-v0.1 • Text Classification • Updated Oct 14, 2024 • 3.16k • 59
sfairXC/FsfairX-Zephyr-Chat-v0.1 • Text Generation • Updated Apr 24, 2024 • 23 • 8
weqweasdas/RM-Mistral-7B • Text Classification • Updated Mar 31, 2024 • 195 • 23
liked a Space about 1 year ago
Reward Bench Leaderboard 📐 • Running • 377
Display and filter reward model evaluation data
liked 2 models over 1 year ago
weqweasdas/RM-Gemma-7B • Text Classification • Updated Mar 22, 2024 • 47 • 8
weqweasdas/RM-Gemma-2B • Text Classification • Updated Mar 22, 2024 • 750 • 25
liked a model almost 2 years ago
weqweasdas/hh_rlhf_rm_open_llama_3b • Text Classification • Updated Feb 25, 2024 • 612 • 17
liked a Space about 2 years ago
Robin 7b 🔥 • Runtime error • 66