Negative Token Merging: Image-based Adversarial Feature Guidance Paper • 2412.01339 • Published Dec 2, 2024
SemEval-2020 Task 12: Multilingual Offensive Language Identification in Social Media (OffensEval 2020) Paper • 2006.07235 • Published Jun 12, 2020
Summon a Demon and Bind it: A Grounded Theory of LLM Red Teaming in the Wild Paper • 2311.06237 • Published Nov 10, 2023
Surveying (Dis)Parities and Concerns of Compute Hungry NLP Research Paper • 2306.16900 • Published Jun 29, 2023
Efficient Methods for Natural Language Processing: A Survey Paper • 2209.00099 • Published Aug 31, 2022
garak: A Framework for Security Probing Large Language Models Paper • 2406.11036 • Published Jun 16, 2024
Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models Paper • 2406.09403 • Published Jun 13, 2024
Introducing v0.5 of the AI Safety Benchmark from MLCommons Paper • 2404.12241 • Published Apr 18, 2024
Instruction-tuned Language Models are Better Knowledge Learners Paper • 2402.12847 • Published Feb 20, 2024
Detecting Pretraining Data from Large Language Models Paper • 2310.16789 • Published Oct 25, 2023
Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection Paper • 2310.11511 • Published Oct 17, 2023