Hacoo1234/Qwen2-0.5B-GRPO-SeqLenTest512
Transformers
TensorBoard
Safetensors
AI-MO/NuminaMath-TIR
Generated from Trainer
grpo
trl
arxiv:2402.03300
Qwen2-0.5B-GRPO-SeqLenTest512/runs (branch: main)
1 contributor, History: 12 commits
Hacoo1234: Model save (0a7d1c4, verified), 22 days ago
Jul04_15-38-32_a60d940bb1f2: Training in progress, step 10 (22 days ago)
Jul04_15-50-09_a60d940bb1f2: Model save (22 days ago)