Simon Yu PRO

simonycl

https://simonucl.github.io/

AI & ML interests

None yet

Recent Activity

updated a dataset about 1 month ago

simonycl/temp_file_2

published a dataset about 1 month ago

simonycl/temp_file_2

updated a model about 1 month ago

simonycl/Qwen3-4b-slime-no-variance-megatron

upvoted a collection 4 months ago

Verbalized Sampling

Dataset for the paper "Verbalized Sampling: Datasets for Mitigating Mode Collapse and Unlocking LLM Diversity" • 6 items • Updated Oct 31, 2025 • 6

upvoted 4 papers 5 months ago

QueST: Incentivizing LLMs to Generate Difficult Problems

Paper • 2510.17715 • Published Oct 20, 2025 • 35

Verbalized Sampling: How to Mitigate Mode Collapse and Unlock LLM Diversity

Paper • 2510.01171 • Published Oct 1, 2025 • 19

Agent Learning via Early Experience

Paper • 2510.08558 • Published Oct 9, 2025 • 273

GEM: A Gym for Agentic LLMs

Paper • 2510.01051 • Published Oct 1, 2025 • 90

upvoted a paper 6 months ago

Language Models Can Learn from Verbal Feedback Without Scalar Rewards

Paper • 2509.22638 • Published Sep 26, 2025 • 70

upvoted 2 papers 9 months ago

SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning

Paper • 2506.24119 • Published Jun 30, 2025 • 51

WHEN TO ACT, WHEN TO WAIT: Modeling Structural Trajectories for Intent Triggerability in Task-Oriented Dialogue

Paper • 2506.01881 • Published Jun 2, 2025 • 6

upvoted 2 papers 10 months ago

Reinforcing General Reasoning without Verifiers

Paper • 2505.21493 • Published May 27, 2025 • 26

Optimizing Anytime Reasoning via Budget Relative Policy Optimization

Paper • 2505.13438 • Published May 19, 2025 • 36

upvoted a paper 11 months ago

TextArena

Paper • 2504.11442 • Published Apr 15, 2025 • 30

upvoted a paper about 1 year ago

Demons in the Detail: On Implementing Load Balancing Loss for Training Specialized Mixture-of-Expert Models

Paper • 2501.11873 • Published Jan 21, 2025 • 67