Universal and Transferable Adversarial Attacks on Aligned Language Models
Paper • 2307.15043
Safety, Security and Privacy in Machine Learning (data poisoning, jailbreaks, and adversarial attacks)