Shaobai Jiang
shaobaij
AI & ML interests
None yet
Recent Activity
upvoted a paper less than a minute ago
When Models Judge Themselves: Unsupervised Self-Evolution for Multimodal Reasoning
upvoted a paper 1 minute ago
Rethinking Token-Level Policy Optimization for Multimodal Chain-of-Thought
upvoted a paper about 3 hours ago
Reaching Beyond the Mode: RL for Distributional Reasoning in Language Models
Organizations
None yet