zooblastlbz/id-align
Safetensors
llama
arxiv: 11 papers
id-align/trl/trainer (branch: main)
255 kB, 1 contributor, History: 1 commit
Latest commit: a9e1e1a (verified) by zooblastlbz, "Upload folder using huggingface_hub", 3 months ago
File                      Size     Last commit                          Updated       Scan
__init__.py               1.47 kB  Upload folder using huggingface_hub  3 months ago  Safe
base.py                   1.77 kB  Upload folder using huggingface_hub  3 months ago  Safe
ddpo_config.py            4.82 kB  Upload folder using huggingface_hub  3 months ago
ddpo_trainer.py           26.4 kB  Upload folder using huggingface_hub  3 months ago
dpo_trainer.py            61.4 kB  Upload folder using huggingface_hub  3 months ago
iterative_sft_trainer.py  16.2 kB  Upload folder using huggingface_hub  3 months ago
model_config.py           2.9 kB   Upload folder using huggingface_hub  3 months ago
ppo_config.py             8.14 kB  Upload folder using huggingface_hub  3 months ago
ppo_trainer.py            61.8 kB  Upload folder using huggingface_hub  3 months ago
reward_config.py          1.62 kB  Upload folder using huggingface_hub  3 months ago  Safe
reward_trainer.py         13.3 kB  Upload folder using huggingface_hub  3 months ago
sft_trainer.py            24.2 kB  Upload folder using huggingface_hub  3 months ago
utils.py                  31.3 kB  Upload folder using huggingface_hub  3 months ago