CulturalBench Collection A Robust, Diverse and Challenging Benchmark for Measuring Cultural Knowledge of LLMs • 6 items • Updated about 4 hours ago
Will AI Tell Lies to Save Sick Children? Litmus-Testing AI Values Prioritization with AIRiskDilemmas Paper • 2505.14633 • Published 17 days ago • 3
CulturalBench: a Robust, Diverse and Challenging Benchmark on Measuring the (Lack of) Cultural Knowledge of LLMs Paper • 2410.02677 • Published Oct 3, 2024