arxiv:2409.05283
Suyash Fulay
sfulay
AI & ML interests: NLP, CSS
Organizations: None yet
Models: 66
sfulay/zephyr-7b-dpo-full-gpt_consistent-reward-scale-05 • 7B • 7
sfulay/zephyr-7b-dpo-full-prometheus-reward-scale-1-rpo • 7B • 11
sfulay/zephyr-7b-dpo-full-gpt-reward-scale-05 • 7B • 9
sfulay/zephyr-7b-dpo-full-gpt_consistent-reward-scale-1-rpo-gamma-05 • 7B • 6
sfulay/zephyr-7b-dpo-full-gpt-reward-scale-1-rpo • 7B • 3
sfulay/zephyr-7b-dpo-full-gpt_consistent-reward-scale-1-rpo-gamma-2 • 7B • 5
sfulay/zephyr-7b-dpo-full-gpt_consistent-reward-scale-1-rpo • 7B • 12
sfulay/zephyr-7b-dpo-full-gpt-reward-scale-01 • 7B • 5
sfulay/zephyr-7b-dpo-full-gpt-reward-scale-1 • 7B • 5
sfulay/zephyr-7b-dpo-full-gpt-low-curriculum • 7B • 5
Datasets: 0 (none public yet)