Yanjun Zhao
yanjunzhao97
AI & ML interests
None yet
Recent Activity
upvoted a paper 9 days ago
RiskPO: Risk-based Policy Optimization via Verifiable Reward for LLM Post-Training
upvoted a paper 12 days ago
Demystifying Reinforcement Learning in Agentic Reasoning
upvoted a paper 18 days ago
TaTToo: Tool-Grounded Thinking PRM for Test-Time Scaling in Tabular Reasoning
Organizations
None yet