aisi-whitebox/inspect_llama_31_8b_instruct_prompted_sandbagging_mmlu_0_shot_unfiltered
Viewer
•
Updated
•
1k
•
7
Llama 3.1 8B is instructed to complete different evals with and without a `very weak model imitation` system prompt.