Update README.md
README.md CHANGED
@@ -26,7 +26,7 @@ library_name: transformers
 We introduce Trillion-7B-preview, a preview of our latest large language model designed to push the boundaries of multilingual scalability and performance.
 
 
-When comparing performance to training FLOPs for Trillion-7B-preview with competitive models, our model pushes the Pareto frontier, achieving 66.5% average performance while using significantly fewer compute (~9.3×10²² FLOPs). It outperforms models like Mistral-7B-Instruct-v0.3 and SOLAR-10.7B-Instruct-v1.0 while remaining competitive with models requiring 3-8× more compute such as Qwen2.5-7B-Instruct and EXAONE-3.5-7.8B-Instruct. For full benchmark results, see tables below.
+When comparing performance to training FLOPs for Trillion-7B-preview with competitive models, our model pushes the Pareto frontier, achieving around 66.5% average performance while using significantly less compute (~9.3×10²² FLOPs). It outperforms models like Mistral-7B-Instruct-v0.3 and SOLAR-10.7B-Instruct-v1.0 while remaining competitive with models requiring 3-8× more compute such as Qwen2.5-7B-Instruct and EXAONE-3.5-7.8B-Instruct. For full benchmark results, see the tables below.
 
 <p align="center">
   <img src="assets/frontier.png" alt="Average Performance vs. Approximate Training FLOPs" width="700">