On Evaluating the Durability of Safeguards for Open-Weight LLMs Paper • 2412.07097 • Published Dec 10, 2024 • 1
Dynamic Risk Assessments for Offensive Cybersecurity Agents Paper • 2505.18384 • Published 14 days ago • 7
SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks Paper • 2503.15478 • Published Mar 19, 2025 • 11
Offline Data Enhanced On-Policy Policy Gradient with Provable Guarantees Paper • 2311.08384 • Published Nov 14, 2023
ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL Paper • 2402.19446 • Published Feb 29, 2024
Autonomous Evaluation and Refinement of Digital Agents Paper • 2404.06474 • Published Apr 9, 2024 • 2
$BT^2$: Backward-compatible Training with Basis Transformation Paper • 2211.03989 • Published Nov 8, 2022
DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning Paper • 2406.11896 • Published Jun 14, 2024 • 20
Aligning Large Language Models with Representation Editing: A Control Perspective Paper • 2406.05954 • Published Jun 10, 2024
Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning Paper • 2405.10292 • Published May 16, 2024 • 2
Automatic Evaluation of Attribution by Large Language Models Paper • 2305.06311 • Published May 10, 2023
eCeLLM: Generalizing Large Language Models for E-commerce from Large-scale, High-quality Instruction Data Paper • 2402.08831 • Published Feb 13, 2024