Awesome RLHF
A curated collection of datasets, models, Spaces, and papers on Reinforcement Learning from Human Feedback (RLHF).
(Space) Note: A multi-turn evaluation benchmark for chatbots that uses GPT-4 as a judge to evaluate the quality of responses.
garage-bAInd/Open-Platypus
Dataset • Note: A high-quality blend of human and synthetic datasets focused on dialogue and reasoning abilities. A good source for SFT and PPO.
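Records in Open-Platypus follow an Alpaca-style schema (an assumption here: `instruction`, optional `input`, and `output` fields; check the dataset viewer). A minimal sketch of turning such records into single training strings for SFT:

```python
def format_record(record):
    """Turn an Alpaca-style record into one SFT training string.

    Field names ('instruction', 'input', 'output') are an assumption
    based on the common Alpaca layout, not confirmed by this page.
    """
    if record.get("input"):
        prompt = (
            "### Instruction:\n" + record["instruction"] + "\n\n"
            "### Input:\n" + record["input"] + "\n\n"
            "### Response:\n"
        )
    else:
        prompt = (
            "### Instruction:\n" + record["instruction"] + "\n\n"
            "### Response:\n"
        )
    return prompt + record["output"]

example = {"instruction": "Add 2 and 3.", "input": "", "output": "5"}
print(format_record(example))
```

In practice you would map this function over the dataset (e.g. with `datasets.map`) before tokenization.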
meta-llama/Llama-2-7b-chat-hf
Text Generation • Note: The first series of open-access models trained at scale using RLHF. Based on https://huggingface.co/papers/2307.09288
meta-llama/Llama-2-70b-chat-hf
Text Generation • Note: The first series of open-access models trained at scale using RLHF. Based on https://huggingface.co/papers/2307.09288
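The chat variants expect Llama 2's instruction format: the user turn wrapped in `[INST] … [/INST]`, with an optional system block in `<<SYS>>` tags inside the first instruction span. A minimal sketch of building a single-turn prompt by hand (the weights themselves are gated, so no model download is assumed):

```python
def llama2_chat_prompt(user_msg, system_msg=None):
    """Build a single-turn Llama-2-chat prompt string.

    Follows the Llama 2 chat convention: an optional <<SYS>> system
    block nested inside the first [INST] span.
    """
    if system_msg:
        inner = f"<<SYS>>\n{system_msg}\n<</SYS>>\n\n{user_msg}"
    else:
        inner = user_msg
    return f"[INST] {inner} [/INST]"

print(llama2_chat_prompt("What is RLHF?", system_msg="Be concise."))
```

Once you have access to the model, `tokenizer.apply_chat_template` in `transformers` produces this formatting for you from a list of role/content messages.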
lightonai/alfred-40b-0723
Text Generation • Note: An RLHF-tuned version of Falcon-40B.
Anthropic/hh-rlhf
Dataset • Note: A dataset of dialogues between human annotators and a 52B-parameter language model from Anthropic. Contains "helpfulness" and "harmlessness" subsets that can be used for training reward models. Basis for https://huggingface.co/papers/2112.00861 and https://huggingface.co/papers/2204.05862
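Each hh-rlhf example pairs a `chosen` with a `rejected` continuation, and a reward model is typically trained on such pairs with a pairwise (Bradley-Terry) loss, −log σ(r_chosen − r_rejected). A minimal pure-Python sketch of that loss, with placeholder reward scores standing in for a real model's outputs:

```python
import math

def pairwise_loss(r_chosen, r_rejected):
    """Bradley-Terry reward-model loss: -log sigmoid(r_chosen - r_rejected).

    The loss shrinks as the reward model scores the chosen response
    above the rejected one, and grows when the ordering is wrong.
    """
    margin = r_chosen - r_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Placeholder scores: a correct preference gives a small loss,
# a reversed preference gives a large one.
print(pairwise_loss(2.0, -1.0))
print(pairwise_loss(-1.0, 2.0))
```

At the margin of zero (no preference learned) the loss is exactly log 2, which is a handy sanity check when debugging a reward-model training loop.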
Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback
Paper • 2204.05862 • Published

A General Language Assistant as a Laboratory for Alignment
Paper • 2112.00861 • Published

stanfordnlp/SHP
Dataset

openai/summarize_from_feedback
Dataset

openai/webgpt_comparisons
Dataset