Category Error

#3
by CO-IR - opened

Regarding the benchmark metrics: HLE (Humanity's Last Exam) should be listed under higher-order reasoning, not the code domain.
