AI & ML interests

Reinforcement Learning, Large Language Models, Value Alignment

Recent Activity

dayone3nder updated a dataset 12 days ago
PKU-Alignment/self-monitor
dayone3nder published a dataset 12 days ago
PKU-Alignment/self-monitor
Gaie updated a collection about 1 month ago
Language Model Resist Alignment