AlphaStar Unplugged: Large-Scale Offline Reinforcement Learning • Paper 2308.03526 • Published Aug 7, 2023
Simple synthetic data reduces sycophancy in large language models • Paper 2308.03958 • Published Aug 7, 2023