Weird result in the leaderboard? Qwen
#10, opened by cerisara
Thanks for the great work!
There's a weird result in the leaderboard: Qwen2.5-14b-instruct scores 42%, but Qwen2.5-Math-72b-instruct scores only 6%. OK, the leaderboard may not be focused on maths, although there should be some maths questions in the datasets, but I still find this weird: could it be caused by a scoring/format issue? I don't know, I may be wrong...
Can we have access to the dataset, please? Sorry, I couldn't find it...
Thanks!