Add Hugging Face paper link to model card
This PR improves the model card by adding a link to the Hugging Face paper page for better visibility and easier access to the research paper, alongside the existing arXiv link.
README.md CHANGED

@@ -1,19 +1,19 @@
 ---
 base_model: LGAI-EXAONE/EXAONE-4.0-32B
-base_model_relation: quantized
-license: other
-license_name: exaone
-license_link: LICENSE
 language:
 - en
 - ko
 - es
+library_name: transformers
+license: other
+license_name: exaone
+license_link: LICENSE
+pipeline_tag: text-generation
 tags:
 - lg-ai
 - exaone
 - exaone-4.0
-
-library_name: transformers
+base_model_relation: quantized
 ---
 
 <p align="center">
@@ -36,7 +36,7 @@ In the EXAONE 4.0 architecture, we apply new architectural changes compared to p
 1. **Hybrid Attention**: For the 32B model, we adopt hybrid attention scheme, which combines *Local attention (sliding window attention)* with *Global attention (full attention)* in a 3:1 ratio. We do not use RoPE (Rotary Positional Embedding) for global attention for better global context understanding.
 2. **QK-Reorder-Norm**: We adopt the Post-LN (LayerNorm) scheme for transformer blocks instead of Pre-LN, and we add RMS normalization right after the Q and K projection. It helps yield better performance on downstream tasks despite consuming more computation.
 
-For more details, please refer to our [technical report](https://arxiv.org/abs/2507.11407), [blog](https://www.lgresearch.ai/blog/view?seq=576), and [GitHub](https://github.com/LG-AI-EXAONE/EXAONE-4.0).
+For more details, please refer to our [technical report](https://arxiv.org/abs/2507.11407), [Hugging Face paper page](https://huggingface.co/papers/2507.11407), [blog](https://www.lgresearch.ai/blog/view?seq=576), and [GitHub](https://github.com/LG-AI-EXAONE/EXAONE-4.0).
 
 
 ### Model Configuration
@@ -836,9 +836,9 @@ The following tables show the evaluation results of each model, with reasoning a
 <td >KMMLU-Redux</td>
 <td align="center">46.9</td>
 <td align="center">25.0</td>
-<td align="center">
-<td align="center">
-<td align="center">
+<td align="center">19.4</td>
+<td align="center">29.8</td>
+<td align="center">26.4</td>
 </tr>
 <tr>
 <td >KSM</td>
@@ -1142,4 +1142,4 @@ The model is licensed under [EXAONE AI Model License Agreement 1.2 - NC](./LICEN
 
 ## Contact
 
-LG AI Research Technical Support: [email protected]
+LG AI Research Technical Support: [email protected]
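The two architectural notes quoted as context in the diff above (the 3:1 local/global hybrid attention pattern and QK-Reorder-Norm) are easier to follow with a concrete sketch. The snippet below is a minimal PyTorch illustration only, not the EXAONE 4.0 implementation: the names `QKNormAttention` and `build_hybrid_layers`, the `window` size, and all dimensions are assumptions for this sketch, and the Post-LN block layout and RoPE handling described in the card are omitted for brevity.

```python
# Minimal sketch of the ideas described in the model card; illustrative names and sizes only.
from typing import Optional

import torch
import torch.nn as nn
import torch.nn.functional as F


class QKNormAttention(nn.Module):
    """Causal self-attention with RMSNorm applied right after the Q and K projections
    ("QK-Reorder-Norm") and an optional sliding window for local-attention layers."""

    def __init__(self, dim: int, num_heads: int, window: Optional[int] = None):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.window = window  # None -> global (full) attention layer
        self.q_proj = nn.Linear(dim, dim, bias=False)
        self.k_proj = nn.Linear(dim, dim, bias=False)
        self.v_proj = nn.Linear(dim, dim, bias=False)
        self.o_proj = nn.Linear(dim, dim, bias=False)
        # Per-head RMSNorm on queries and keys (torch.nn.RMSNorm needs PyTorch >= 2.4).
        self.q_norm = nn.RMSNorm(self.head_dim)
        self.k_norm = nn.RMSNorm(self.head_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, _ = x.shape
        q = self.q_proj(x).view(b, t, self.num_heads, self.head_dim)
        k = self.k_proj(x).view(b, t, self.num_heads, self.head_dim)
        v = self.v_proj(x).view(b, t, self.num_heads, self.head_dim)
        # QK-Reorder-Norm: normalize Q and K right after their projections.
        q, k = self.q_norm(q), self.k_norm(k)
        # (RoPE would be applied to local layers here; the card skips it for global layers.)
        mask = None
        if self.window is not None:
            idx = torch.arange(t, device=x.device)
            # Causal sliding-window mask: token i attends to j where i - window < j <= i.
            mask = (idx[None, :] <= idx[:, None]) & (idx[:, None] - idx[None, :] < self.window)
        out = F.scaled_dot_product_attention(
            q.transpose(1, 2), k.transpose(1, 2), v.transpose(1, 2),
            attn_mask=mask, is_causal=(mask is None),
        )
        return self.o_proj(out.transpose(1, 2).reshape(b, t, -1))


def build_hybrid_layers(num_layers: int, dim: int, num_heads: int, window: int = 4096):
    """3:1 hybrid pattern: three sliding-window layers for every full-attention layer."""
    return nn.ModuleList(
        QKNormAttention(dim, num_heads, window=None if (i + 1) % 4 == 0 else window)
        for i in range(num_layers)
    )
```

Here `build_hybrid_layers` simply makes every fourth layer global, which is one straightforward way to realize the 3:1 ratio stated in the card; the actual EXAONE 4.0 layer schedule, window size, and Post-LN placement should be taken from the linked technical report.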