rkumar1999/Llama3.2-3B-Prover-openr1-distill-GRPO
Text Generation
Transformers
TensorBoard
Safetensors
rkumar1999/OBT-proof-short-60-n10000
llama
Generated from Trainer
open-r1
trl
grpo
conversational
text-generation-inference
arxiv:2402.03300
main: Llama3.2-3B-Prover-openr1-distill-GRPO/runs
43.1 kB, 1 contributor, history: 1 commit
rkumar1999: Training in progress, epoch 0 (59ed3e8, verified), 25 days ago
Oct06_18-36-09_a5810eaa    Training in progress, epoch 0    25 days ago
Oct06_19-07-36_a5810eaa    Training in progress, epoch 0    25 days ago
Oct06_19-26-27_a5810eaa    Training in progress, epoch 0    25 days ago
Oct06_19-49-23_a5810eaa    Training in progress, epoch 0    25 days ago