garyzhang
xiaoniqiu
AI & ML interests
LLM, Agents
Recent Activity
commented on a paper about 13 hours ago
On-Policy RL Meets Off-Policy Experts: Harmonizing Supervised Fine-Tuning and Reinforcement Learning via Dynamic Weighting
commented on a paper 1 day ago
On-Policy RL Meets Off-Policy Experts: Harmonizing Supervised Fine-Tuning and Reinforcement Learning via Dynamic Weighting
Organizations
None yet