DPO vs KTO vs IPO - an alignment-handbook Collection

DPO vs KTO vs IPO

Updated Jan 16, 2024

A collection of datasets and models used for the blog post "Aligning LLMs with Direct Preference Optimization Methods"

HuggingFaceH4/orca_dpo_pairs

Updated Apr 14, 2024 • 12.9k • 144 • 29

HuggingFaceH4/ultrafeedback_binarized

Updated Oct 16, 2024 • 187k • 11.1k • 297