ZhangQiao123/medical-model-grpo-16bit
Text Generation
Transformers
Safetensors
PyTorch
FreedomIntelligence/medical-o1-reasoning-SFT
Chinese
mistral
unsloth
medical
chinese
grpo
text-generation-inference
License: apache-2.0
Could you explain how the reward function was designed?
#1 by lhlhlsc, opened about 1 month ago
Discussion
lhlhlsc
about 1 month ago
As the title says.