Shenzhi Wang

shenzhi-wang

https://shenzhi-wang.netlify.app/

ShenzhiWang_THU

AI & ML interests

Large Language Models, Reinforcement Learning, and AI Agents

Recent Activity

upvoted 2 papers about 1 month ago

IMG: Calibrating Diffusion Models via Implicit Multimodal Guidance

Paper • 2509.26231 • Published Sep 30 • 17

Variational Reasoning for Language Models

Paper • 2509.22637 • Published Sep 26 • 68

upvoted a paper 2 months ago

T2R-bench: A Benchmark for Generating Article-Level Reports from Real World Industrial Tables

Paper • 2508.19813 • Published Aug 27 • 25

upvoted a paper 3 months ago

Group Sequence Policy Optimization

Paper • 2507.18071 • Published Jul 24 • 306

upvoted a paper 4 months ago

Reinforcement Pre-Training

Paper • 2506.08007 • Published Jun 9 • 262

authored a paper 5 months ago

Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning

Paper • 2506.01939 • Published Jun 2 • 185

upvoted a paper 5 months ago

Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning

Paper • 2506.01939 • Published Jun 2 • 185

commented a paper 5 months ago

Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning

Paper • 2506.01939 • Published Jun 2 • 185

upvoted 3 papers 6 months ago

Seek in the Dark: Reasoning via Test-Time Instance-Level Policy Gradient in Latent Space

Paper • 2505.13308 • Published May 19 • 27

WorldPM: Scaling Human Preference Modeling

Paper • 2505.10527 • Published May 15 • 34

Qwen3 Technical Report

Paper • 2505.09388 • Published May 14 • 308

authored a paper 6 months ago

Absolute Zero: Reinforced Self-play Reasoning with Zero Data

Paper • 2505.03335 • Published May 6 • 185

upvoted 2 papers 6 months ago

Absolute Zero: Reinforced Self-play Reasoning with Zero Data

Paper • 2505.03335 • Published May 6 • 185

Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?

Paper • 2504.13837 • Published Apr 18 • 135

authored 2 papers 7 months ago

COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values

Paper • 2504.05535 • Published Apr 7 • 44

COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values

Paper • 2504.05535 • Published Apr 7 • 44

upvoted a paper 7 months ago

COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values

Paper • 2504.05535 • Published Apr 7 • 44

upvoted 2 papers 8 months ago

4D LangSplat: 4D Language Gaussian Splatting via Multimodal Large Language Models

Paper • 2503.10437 • Published Mar 13 • 32

ART: Anonymous Region Transformer for Variable Multi-Layer Transparent Image Generation

Paper • 2502.18364 • Published Feb 25 • 37

updated a model 9 months ago

xwen-team/Xwen-0.5B-Chat

0.6B • Updated Feb 12 • 2
