Are you willing to test the model's ability to filter malicious behavior on AdvBench?
1
#3 opened 3 days ago by Byerose
Occasionally mix some English words into the generated text
10
#2 opened 5 months ago by kli017
Adding Evaluation Results
#1 opened 5 months ago by leaderboard-pr-bot
