DPO • Direct Preference Optimization: Your Language Model is Secretly a Reward Model • Paper 2305.18290 • Published May 29, 2023
Fewshot • Can Generalist Foundation Models Outcompete Special-Purpose Tuning? Case Study in Medicine • Paper 2311.16452 • Published Nov 28, 2023