Article: Illustrating Reinforcement Learning from Human Feedback (RLHF) By natolambert and 3 others • Dec 9, 2022 • 292
GradientCuff-Jailbreak-Defense: Demonstration of Gradient Cuff: A Jailbreak Defense