Awesome RLHF
A curated collection of datasets, models, Spaces, and papers on Reinforcement Learning from Human Feedback (RLHF).
(Space) Note: A multi-turn evaluation benchmark for chatbots that uses GPT-4 as a judge to evaluate the quality of responses.
garage-bAInd/Open-Platypus
Dataset • Note: A high-quality blend of human and synthetic datasets focused on dialogue and reasoning abilities. A good source for SFT and PPO.
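Records in Open-Platypus follow an Alpaca-style schema (an assumption here: `instruction`, optional `input`, and `output` fields; check the dataset viewer). A minimal sketch of turning such records into single training strings for SFT:

```python
def format_record(record):
    """Turn an Alpaca-style record into one SFT training string.

    Field names ('instruction', 'input', 'output') are an assumption
    based on the common Alpaca layout, not confirmed by this page.
    """
    if record.get("input"):
        prompt = (
            "### Instruction:\n" + record["instruction"] + "\n\n"
            "### Input:\n" + record["input"] + "\n\n"
            "### Response:\n"
        )
    else:
        prompt = (
            "### Instruction:\n" + record["instruction"] + "\n\n"
            "### Response:\n"
        )
    return prompt + record["output"]

example = {"instruction": "Add 2 and 3.", "input": "", "output": "5"}
print(format_record(example))
```

In practice you would map this function over the dataset (e.g. with `datasets.map`) before tokenization.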
meta-llama/Llama-2-7b-chat-hf
Text Generation • Note: The first series of open-access models trained at scale using RLHF. Based on https://huggingface.co/papers/2307.09288
meta-llama/Llama-2-70b-chat-hf
Text Generation • Note: The first series of open-access models trained at scale using RLHF. Based on https://huggingface.co/papers/2307.09288
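The chat variants expect Llama 2's instruction format: the user turn wrapped in `[INST] … [/INST]`, with an optional system block in `<<SYS>>` tags inside the first instruction span. A minimal sketch of building a single-turn prompt by hand (the weights themselves are gated, so no model download is assumed):

```python
def llama2_chat_prompt(user_msg, system_msg=None):
    """Build a single-turn Llama-2-chat prompt string.

    Follows the Llama 2 chat convention: an optional <<SYS>> system
    block nested inside the first [INST] span.
    """
    if system_msg:
        inner = f"<<SYS>>\n{system_msg}\n<</SYS>>\n\n{user_msg}"
    else:
        inner = user_msg
    return f"[INST] {inner} [/INST]"

print(llama2_chat_prompt("What is RLHF?", system_msg="Be concise."))
```

Once you have access to the model, `tokenizer.apply_chat_template` in `transformers` produces this formatting for you from a list of role/content messages.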
lightonai/alfred-40b-0723
Text Generation • Note: An RLHF-tuned version of Falcon-40B.
Anthropic/hh-rlhf
Dataset • Note: A dataset of dialogues between human annotators and a 52B-parameter language model from Anthropic. Contains "helpfulness" and "harmlessness" subsets that can be used for training reward models. Basis for https://huggingface.co/papers/2112.00861 and https://huggingface.co/papers/2204.05862
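Each hh-rlhf example pairs a `chosen` with a `rejected` continuation, and a reward model is typically trained on such pairs with a pairwise (Bradley-Terry) loss, −log σ(r_chosen − r_rejected). A minimal pure-Python sketch of that loss, with placeholder reward scores standing in for a real model's outputs:

```python
import math

def pairwise_loss(r_chosen, r_rejected):
    """Bradley-Terry reward-model loss: -log sigmoid(r_chosen - r_rejected).

    The loss shrinks as the reward model scores the chosen response
    above the rejected one, and grows when the ordering is wrong.
    """
    margin = r_chosen - r_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Placeholder scores: a correct preference gives a small loss,
# a reversed preference gives a large one.
print(pairwise_loss(2.0, -1.0))
print(pairwise_loss(-1.0, 2.0))
```

At the margin of zero (no preference learned) the loss is exactly log 2, which is a handy sanity check when debugging a reward-model training loop.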
Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback
Paper • 2204.05862 • Published

A General Language Assistant as a Laboratory for Alignment
Paper • 2112.00861 • Published

stanfordnlp/SHP
Dataset

openai/summarize_from_feedback
Dataset

openai/webgpt_comparisons
Dataset