Center for AI Safety

non-profit

https://www.safe.ai

Recent Activity

justinphan3110 submitted a paper about 1 month ago

Reducing Political Manipulation with Consistency Training

justinphan3110 updated a dataset 5 months ago

cais/hle-rolling

justinphan3110 new activity 6 months ago

cais/hle: adds eval.yaml

Papers

Reducing Political Manipulation with Consistency Training

Humanity's Last Exam

Collections 2

spaces 1

TextQuests

📟

How Good are LLMs at Text-Based Video Games?

models 8

datasets 12

cais/hle-rolling

Viewer • Updated Feb 20 • 2.62k • 739 • 17

cais/hle

Benchmark • Updated Jan 20 • 2.5k • 30.2k • 857

cais/rli-public-set

Updated Nov 3, 2025 • 322 • 5

cais/rli-example-deliverables

Viewer • Updated Nov 1, 2025 • 176 • 71

cais/wmdp-cyber-forget-corpus

Viewer • Updated May 29, 2025 • 1k • 391 • 5

cais/wmdp-bio-forget-corpus

Viewer • Updated May 29, 2025 • 24.5k • 1.19k • 3

cais/MASK

Viewer • Updated Mar 20, 2025 • 1k • 4.91k • 14

cais/imagenet-o

Viewer • Updated May 27, 2024 • 2k • 143

cais/wmdp

Viewer • Updated Apr 27, 2024 • 3.67k • 25.8k • 29

cais/wmdp-mmlu-auxiliary-corpora

Viewer • Updated Apr 25, 2024 • 8.88k • 84 • 5
