arxiv:2501.16142
Yuandong Tian
tydsh
AI & ML interests
Reinforcement Learning, Optimization, Representation Learning
Recent Activity
authored a paper 1 day ago
Towards General-Purpose Model-Free Reinforcement Learning
authored a paper 6 days ago
Step-KTO: Optimizing Mathematical Reasoning through Stepwise Binary Feedback
authored a paper about 2 months ago
Training Large Language Models to Reason in a Continuous Latent Space
Organizations
None yet
Papers
20
Models
None public yet
Datasets
None public yet