arxiv:2510.00553
xuxin
xx18
Recent Activity
upvoted a paper 22 days ago
Training-Free Group Relative Policy Optimization
authored a paper 23 days ago
S$^3$c-Math: Spontaneous Step-level Self-correction Makes Large Language Models Better Mathematical Reasoners
authored a paper 23 days ago
VerifyBench: Benchmarking Reference-based Reward Systems for Large Language Models