Fan Zhou

koalazf99

https://koalazf99.github.io/

AI & ML interests

Deep Learning; Natural Language Processing; Foundation Models

Recent Activity

authored a paper about 13 hours ago

OctoThinker: Mid-training Incentivizes Reinforcement Learning Scaling

Paper • 2506.20512 • Published 2 days ago • 30

New activity in OctoThinker/MegaMath-Web-Pro-Max about 19 hours ago

[bot] Conversion to Parquet

#3 opened about 22 hours ago by parquet-converter

liked a dataset 1 day ago

OctoThinker/MegaMath-Web-Pro-Max

Viewer • Updated about 19 hours ago • 69.2M • 12

updated a collection 1 day ago

🐙 OctoThinker

Mid-training Incentivizes Reinforcement Learning Scaling • 18 items • Updated 1 day ago • 1

upvoted a paper 1 day ago

OctoThinker: Mid-training Incentivizes Reinforcement Learning Scaling

Paper • 2506.20512 • Published 2 days ago • 30

updated a collection 2 days ago

🐙 OctoThinker

Mid-training Incentivizes Reinforcement Learning Scaling • 18 items • Updated 1 day ago • 1

updated a collection 7 days ago

🧙 Guru

Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective • 4 items • Updated 7 days ago

authored a paper 7 days ago

Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective

Paper • 2506.14965 • Published 10 days ago • 42

upvoted a paper 8 days ago

Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective

Paper • 2506.14965 • Published 10 days ago • 42

liked a dataset 17 days ago

princeton-nlp/SWE-bench_Verified

Viewer • Updated Feb 18 • 500 • 438k • 182

liked a dataset 25 days ago

LLM360/guru-RL-92k

Viewer • Updated about 6 hours ago • 91.9k • 428 • 14

upvoted a paper 29 days ago

Thinking with Generated Images

Paper • 2505.22525 • Published about 1 month ago • 14

upvoted a paper about 1 month ago

ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows

Paper • 2505.19897 • Published May 26 • 102

New activity in LLM360/MegaMath about 1 month ago

Megamath-code parquets do not contain text column

#6 opened about 2 months ago by

upvoted 2 papers about 1 month ago

Learn to Reason Efficiently with Adaptive Length-based Reward Shaping

Paper • 2505.15612 • Published May 21 • 33

Efficient Agent Training for Computer Use

Paper • 2505.13909 • Published May 20 • 44

liked a dataset about 1 month ago

xlangai/Jedi

Preview • Updated about 7 hours ago • 1.15k • 10