yhuanghamu/Qwen2-0.5B-GRPO-test
Transformers
TensorBoard
Safetensors
AI-MO/NuminaMath-TIR
Generated from Trainer
trl
grpo
arxiv:2402.03300
Commit History (branch: main)
982fd21 | End of training | verified | yhuanghamu committed about 1 month ago
4e950b3 | Model save | verified | yhuanghamu committed about 1 month ago
2dd5f9b | Training in progress, step 113 | verified | yhuanghamu committed about 1 month ago
14d1d65 | Training in progress, step 110 | verified | yhuanghamu committed about 1 month ago
139d06f | Training in progress, step 100 | verified | yhuanghamu committed about 1 month ago
1f3ca54 | Training in progress, step 90 | verified | yhuanghamu committed about 1 month ago
3428010 | Training in progress, step 80 | verified | yhuanghamu committed about 1 month ago
75cce2a | Training in progress, step 70 | verified | yhuanghamu committed about 1 month ago
a67d04e | Training in progress, step 60 | verified | yhuanghamu committed about 1 month ago
6a7e9c5 | Training in progress, step 50 | verified | yhuanghamu committed about 1 month ago
0762a12 | Training in progress, step 40 | verified | yhuanghamu committed about 1 month ago
f33cd5b | Training in progress, step 30 | verified | yhuanghamu committed about 1 month ago
5dad765 | Training in progress, step 20 | verified | yhuanghamu committed about 1 month ago
1031b43 | Training in progress, step 10 | verified | yhuanghamu committed about 1 month ago
c7bbaab | initial commit | verified | yhuanghamu committed about 1 month ago