Recreating MMLU scores

#2
by theblackcat102 - opened

Do you guys use lm-evaluation-harness for MMLU evaluation? I'm not getting the stark improvement shown in the fineweb-edu image when using this checkpoint.

FineData org

We do not. I've added a note to the top of this file detailing how you can reproduce our setup: https://huggingface.co/datasets/HuggingFaceFW/fineweb/blob/main/lighteval_tasks.py

@guipenedo Thanks for the quick reply. I will check it out.

theblackcat102 changed discussion status to closed
