vectorzhou/vectorzhou-Qwen2-5-1-5B-Instruct-SFT-OpenHerm-ction-v0-1-OnlineIPO1-lora-0603201242-epoch-1
Text Generation
Transformers
Safetensors
OpenRLHF/prompt-collection-v0.1
Generated from Trainer
fine-tuned
trl
extra-gradient
conversational
arxiv:2503.08942
main
vocab.json
vectorzhou
Epoch 1 checkpoint
7e3ab73
verified
7 days ago
Safe
3.38 MB
File too large to display; check the raw version instead.