HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal Paper • 2402.04249 • Published Feb 6, 2024 • 6
cais/HarmBench-Llama-2-13b-cls-multimodal-behaviors Text Generation • 13B • Updated Apr 11, 2024 • 8