Hanze Dong

hendrydong

https://hendrydong.github.io

AI & ML interests

None yet

Recent Activity

liked a model about 1 month ago

Salesforce/E1-AceReason-14B

updated a model about 1 month ago

hendrydong/demonstration

upvoted a paper 27 days ago

Chain-of-Experts: Unlocking the Communication Power of Mixture-of-Experts Models

Paper • 2506.18945 • Published about 1 month ago • 39

upvoted 2 collections 2 months ago

Minimal-RL

Collection

2 items • Updated May 23 • 1

Elastic-Reasoning

Collection

5 items • Updated May 31 • 5

upvoted 4 papers 2 months ago

Scaling Computer-Use Grounding via User Interface Decomposition and Synthesis

Paper • 2505.13227 • Published May 19 • 46

Fractured Chain-of-Thought Reasoning

Paper • 2505.12992 • Published May 19 • 22

Beyond 'Aha!': Toward Systematic Meta-Abilities Alignment in Large Reasoning Models

Paper • 2505.10554 • Published May 15 • 120

Scalable Chain of Thoughts via Elastic Reasoning

Paper • 2505.05315 • Published May 8 • 26

upvoted a paper 3 months ago

CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training

Paper • 2504.13161 • Published Apr 17 • 92

upvoted 2 papers 5 months ago

Predictive Data Selection: The Data That Predicts Is the Data That Teaches

Paper • 2503.00808 • Published Mar 2 • 57

Self-rewarding correction for mathematical reasoning

Paper • 2502.19613 • Published Feb 26 • 84

upvoted an article 5 months ago

Open R1: Update #2

Article • and 6 others • Feb 10 • 216

upvoted a paper 6 months ago

Reward-Guided Speculative Decoding for Efficient LLM Reasoning

Paper • 2501.19324 • Published Jan 31 • 40

upvoted a paper 7 months ago

Offline Reinforcement Learning for LLM Multi-Step Reasoning

Paper • 2412.16145 • Published Dec 20, 2024 • 39

upvoted 4 papers about 1 year ago

RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment

Paper • 2304.06767 • Published Apr 13, 2023 • 2

LMFlow: An Extensible Toolkit for Finetuning and Inference of Large Foundation Models

Paper • 2306.12420 • Published Jun 21, 2023 • 2

RLHF Workflow: From Reward Modeling to Online RLHF

Paper • 2405.07863 • Published May 13, 2024 • 72

Reverse Diffusion Monte Carlo

Paper • 2307.02037 • Published Jul 5, 2023 • 1
