Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective
Fan Zhou
koalazf99
AI & ML interests
Deep Learning; Natural Language Processing; Foundation Models
Recent Activity
authored a paper (about 8 hours ago)
OctoThinker: Mid-training Incentivizes Reinforcement Learning Scaling
new activity (about 14 hours ago)
OctoThinker/MegaMath-Web-Pro-Max: [bot] Conversion to Parquet
liked a dataset (1 day ago)
OctoThinker/MegaMath-Web-Pro-Max