ChatCounselor: A Large Language Models for Mental Health Support Paper • 2309.15461 • Published Sep 27, 2023
In Search of the Long-Tail: Systematic Generation of Long-Tail Knowledge via Logical Rule Guided Search Paper • 2311.07237 • Published Nov 13, 2023
A Trembling House of Cards? Mapping Adversarial Attacks against Language Agents Paper • 2402.10196 • Published Feb 15, 2024
AttributionBench: How Hard is Automatic Attribution Evaluation? Paper • 2402.15089 • Published Feb 23, 2024
AmpleGCG: Learning a Universal and Transferable Generative Model of Adversarial Suffixes for Jailbreaking Both Open and Closed LLMs Paper • 2404.07921 • Published Apr 11, 2024
Introducing v0.5 of the AI Safety Benchmark from MLCommons Paper • 2404.12241 • Published Apr 18, 2024
EIA: Environmental Injection Attack on Generalist Web Agents for Privacy Leakage Paper • 2409.11295 • Published Sep 17, 2024
ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery Paper • 2410.05080 • Published Oct 7, 2024
AmpleGCG-Plus: A Strong Generative Model of Adversarial Suffixes to Jailbreak LLMs with Higher Success Rates in Fewer Attempts Paper • 2410.22143 • Published Oct 29, 2024
AdvWeb: Controllable Black-box Attacks on VLM-powered Web Agents Paper • 2410.17401 • Published Oct 22, 2024
RobustLR: Evaluating Robustness to Logical Perturbation in Deductive Reasoning Paper • 2205.12598 • Published May 25, 2022
RedTeamCUA: Realistic Adversarial Testing of Computer-Use Agents in Hybrid Web-OS Environments Paper • 2505.21936 • Published 10 days ago
UGround Collection • Navigating GUIs as Humans Do: Universal Visual Grounding for GUI Agents (ICLR'25 Oral) • 10 items • Updated May 4