Models used in CHARM: Calibrating Reward Models With Chatbot Arena Scores.
shawnxzhu
AI & ML interests
None yet
Recent Activity
upvoted a paper 3 days ago
Beyond Pass@1: Self-Play with Variational Problem Synthesis Sustains RLVR
liked a dataset 3 months ago
TIGER-Lab/WebInstruct-verified
Organizations
None yet