qiangpoz
qpz
AI & ML interests
None yet
Recent Activity
commented on a paper about 1 month ago
A Minimalist Approach to LLM Reasoning: from Rejection Sampling to Reinforce
upvoted a paper 3 months ago
DAPO: An Open-Source LLM Reinforcement Learning System at Scale
new activity 4 months ago
deepseek-ai/DeepSeek-R1-Distill-Qwen-32B: Generate crashed by repeatedly generating <think>
Organizations
models 0
None public yet
datasets 0
None public yet