The collection for the paper "SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild"
HKUST NLP Group
university
Collections (7)
Models (45)
hkust-nlp/Llama-3.1-8B-SimpleRL-Zoo • 2
hkust-nlp/Qwen-2.5-32B-SimpleRL-Zoo • 10
hkust-nlp/Qwen-2.5-7B-SimpleRL-Zoo • 5
hkust-nlp/DeepSeek-Math-7B-SimpleRL-Zoo • 2
hkust-nlp/Mistral-7B-v0.1-SimpleRL-Zoo • 2
hkust-nlp/Qwen-2.5-1.5B-SimpleRL-Zoo • 102
hkust-nlp/Qwen-2.5-0.5B-SimpleRL-Zoo • 1
hkust-nlp/Qwen-2.5-14B-SimpleRL-Zoo • 1
hkust-nlp/Mistral-Small-24B-SimpleRL-Zoo • 1
hkust-nlp/Qwen-2.5-Math-7B-SimpleRL-Zoo • 1
Datasets (22)
hkust-nlp/SimpleRL-Zoo-Data • 53.1k • 44 • 2
hkust-nlp/PreSelect-100B • 54.5M • 1.38k • 9
hkust-nlp/CodeIO-PyEdu-Reasoning • 426 • 45
hkust-nlp/CodeIO-PyEdu-Reasoning-Raw • 167
hkust-nlp/SynCSE-partial-NLI • 263k • 87 • 2
hkust-nlp/SynCSE-scratch-NLI • 276k • 107 • 2
hkust-nlp/gsm8k-fix • 7.47k • 100 • 2
hkust-nlp/dart-math-uniform • 591k • 120 • 9
hkust-nlp/vrt-baseline • 591k • 62 • 1
hkust-nlp/dart-math-hard • 585k • 141 • 13