Fan Zhou

koalazf99

https://koalazf99.github.io/

AI & ML interests

Deep Learning; Natural Language Processing; Foundation Models

Recent Activity

authored a paper about 13 hours ago

OctoThinker: Mid-training Incentivizes Reinforcement Learning Scaling

new activity about 19 hours ago

OctoThinker/MegaMath-Web-Pro-Max: [bot] Conversion to Parquet

liked a dataset 1 day ago

OctoThinker/MegaMath-Web-Pro-Max

upvoted a paper 1 day ago

OctoThinker: Mid-training Incentivizes Reinforcement Learning Scaling

Paper • 2506.20512 • Published 2 days ago • 30

upvoted a paper 8 days ago

Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective

Paper • 2506.14965 • Published 10 days ago • 42

upvoted a paper 29 days ago

Thinking with Generated Images

Paper • 2505.22525 • Published about 1 month ago • 14

upvoted 5 papers about 1 month ago

ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows

Paper • 2505.19897 • Published May 26 • 102

Learn to Reason Efficiently with Adaptive Length-based Reward Shaping

Paper • 2505.15612 • Published May 21 • 33

Efficient Agent Training for Computer Use

Paper • 2505.13909 • Published May 20 • 44

Scaling Computer-Use Grounding via User Interface Decomposition and Synthesis

Paper • 2505.13227 • Published May 19 • 45

Qwen3 Technical Report

Paper • 2505.09388 • Published May 14 • 204

upvoted a paper 2 months ago

Generative AI Act II: Test Time Scaling Drives Cognition Engineering

Paper • 2504.13828 • Published Apr 18 • 17

upvoted 3 papers 3 months ago

MegaMath: Pushing the Limits of Open Math Corpora

Paper • 2504.02807 • Published Apr 3 • 31

Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme

Paper • 2504.02587 • Published Apr 3 • 30

SkyLadder: Better and Faster Pretraining via Context Window Scheduling

Paper • 2503.15450 • Published Mar 19 • 11

upvoted an article 4 months ago

DualPipe could be better without the Dual

Article • Published Feb 28 • 17

upvoted 2 papers 4 months ago

Qwen2.5-VL Technical Report

Paper • 2502.13923 • Published Feb 19 • 193

Sailor2: Sailing in South-East Asia with Inclusive Multilingual LLMs

Paper • 2502.12982 • Published Feb 18 • 18

upvoted 3 papers 5 months ago

Teaching Language Models to Critique via Reinforcement Learning

Paper • 2502.03492 • Published Feb 5 • 24

CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction

Paper • 2502.07316 • Published Feb 11 • 50

LIMO: Less is More for Reasoning

Paper • 2502.03387 • Published Feb 5 • 61

upvoted a collection 5 months ago

Qwen2.5-VL

Vision-language model series based on Qwen2.5 • 11 items • Updated Apr 28 • 496

upvoted a paper 6 months ago

rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking

Paper • 2501.04519 • Published Jan 8 • 280