ArmBench-LLM / mmlu_pro_hy_results.csv
daniel7an · commit 4781b83
Model,Accuracy
claude-3-5-haiku-20241022,0.526
claude-3-5-sonnet-20241022,0.701
gemini-2.0-flash,0.741
gemini-1.5-flash,0.586