Update README.md
README.md
CHANGED
@@ -91,7 +91,7 @@ For more details, including benchmark evaluation, hardware requirements, and inf
 | MMLU-Redux | 92.1 | **92.7** | 89.5 | 91.4 |
 | GPQA | **82.8** | 71.1 | 65.8 | 73.4 |
 | SuperGPQA | 57.8 | **60.7** | 51.8 | 56.8 |
-| **Reasoning** | | | | |
+| **Reasoning** | | | | |
 | AIME25 | 72.0 | 81.5 | 70.9 | **85.0** |
 | HMMT25 | 64.2 | 62.5 | 49.8 | **71.4** |
 | LiveBench 20241125 | 74.3 | **77.1** | 74.3 | 76.8 |
@@ -281,4 +281,4 @@ If you find our work helpful, feel free to give us a cite.
 primaryClass={cs.CL},
 url={https://arxiv.org/abs/2505.09388},
 }
-```
+```