nielsr (HF Staff) committed on
Commit dc450d0 · verified · 1 Parent(s): dc25e09

Add Hugging Face paper link to model card


This PR improves the model card by adding a link to the Hugging Face paper page for better visibility and easier access to the research paper, alongside the existing arXiv link.

Files changed (1)
  1. README.md +11 -11
README.md CHANGED
@@ -1,19 +1,19 @@
  ---
  base_model: LGAI-EXAONE/EXAONE-4.0-32B
- base_model_relation: quantized
- license: other
- license_name: exaone
- license_link: LICENSE
  language:
  - en
  - ko
  - es
+ library_name: transformers
+ license: other
+ license_name: exaone
+ license_link: LICENSE
+ pipeline_tag: text-generation
  tags:
  - lg-ai
  - exaone
  - exaone-4.0
- pipeline_tag: text-generation
- library_name: transformers
+ base_model_relation: quantized
  ---

  <p align="center">
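
The reordered front matter keeps the same metadata; in particular, `library_name: transformers` and `pipeline_tag: text-generation` are the fields the Hub uses to pick the text-generation widget and the library code snippet for this card. As a rough illustration, a checkpoint declared this way is typically loaded as sketched below; the repo id is a placeholder (not this repository's actual id), and EXAONE 4.0 additionally requires a transformers release recent enough to include its architecture.

```python
# Minimal sketch of loading a checkpoint whose card declares
# `library_name: transformers` and `pipeline_tag: text-generation`.
# The repo id below is a placeholder, not the actual repository this card
# belongs to. `device_map="auto"` needs the `accelerate` package, and the
# chat-template call assumes the checkpoint ships a chat template
# (EXAONE instruct checkpoints normally do).
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "your-org/EXAONE-4.0-32B-quantized"  # placeholder

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, device_map="auto")

messages = [{"role": "user", "content": "Summarize the EXAONE 4.0 hybrid attention design."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```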
@@ -36,7 +36,7 @@ In the EXAONE 4.0 architecture, we apply new architectural changes compared to p
  1. **Hybrid Attention**: For the 32B model, we adopt a hybrid attention scheme, which combines *Local attention (sliding window attention)* with *Global attention (full attention)* in a 3:1 ratio. We do not use RoPE (Rotary Positional Embedding) for global attention, for better global context understanding.
  2. **QK-Reorder-Norm**: We adopt the Post-LN (LayerNorm) scheme for transformer blocks instead of Pre-LN, and we add RMS normalization right after the Q and K projections. It helps yield better performance on downstream tasks despite consuming more computation.

- For more details, please refer to our [technical report](https://arxiv.org/abs/2507.11407), [blog](https://www.lgresearch.ai/blog/view?seq=576), and [GitHub](https://github.com/LG-AI-EXAONE/EXAONE-4.0).
+ For more details, please refer to our [technical report](https://arxiv.org/abs/2507.11407), [Hugging Face paper page](https://huggingface.co/papers/2507.11407), [blog](https://www.lgresearch.ai/blog/view?seq=576), and [GitHub](https://github.com/LG-AI-EXAONE/EXAONE-4.0).


  ### Model Configuration
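
The two architecture notes quoted in the hunk above describe a 3:1 interleaving of local (sliding-window) and global (full) attention layers, plus RMS normalization applied to Q and K immediately after their projections. The sketch below is an illustrative reconstruction inferred from that prose, not the actual EXAONE 4.0 implementation; the module names, dimensions, and exact interleaving pattern are assumptions, and RoPE, sliding-window masking, and the Post-LN block layout are omitted for brevity.

```python
# Illustrative sketch only (not the real EXAONE 4.0 code): a 3:1 local/global
# layer pattern and "QK-Reorder-Norm", i.e. RMS-normalizing Q and K right
# after their projections. RoPE, sliding-window masking, and the Post-LN
# block layout are left out to keep the example short.
import torch
import torch.nn as nn


def layer_types(num_layers: int) -> list[str]:
    # Assumed pattern: every 4th layer uses global (full) attention,
    # the rest use local (sliding-window) attention -> 3:1 ratio.
    return ["global" if (i + 1) % 4 == 0 else "local" for i in range(num_layers)]


class RMSNorm(nn.Module):
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.weight * x * torch.rsqrt(x.pow(2).mean(-1, keepdim=True) + self.eps)


class QKReorderNormAttention(nn.Module):
    """Toy attention block: RMSNorm is applied to Q and K per head, right after
    the Q/K projections and before attention scores are computed."""

    def __init__(self, hidden: int, num_heads: int):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = hidden // num_heads
        self.q_proj = nn.Linear(hidden, hidden, bias=False)
        self.k_proj = nn.Linear(hidden, hidden, bias=False)
        self.v_proj = nn.Linear(hidden, hidden, bias=False)
        self.o_proj = nn.Linear(hidden, hidden, bias=False)
        self.q_norm = RMSNorm(self.head_dim)
        self.k_norm = RMSNorm(self.head_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, _ = x.shape
        shape = (b, t, self.num_heads, self.head_dim)
        q = self.q_norm(self.q_proj(x).view(shape)).transpose(1, 2)
        k = self.k_norm(self.k_proj(x).view(shape)).transpose(1, 2)
        v = self.v_proj(x).view(shape).transpose(1, 2)
        out = nn.functional.scaled_dot_product_attention(q, k, v, is_causal=True)
        return self.o_proj(out.transpose(1, 2).reshape(b, t, -1))


print(layer_types(8))                     # ['local', 'local', 'local', 'global', ...]
attn = QKReorderNormAttention(hidden=64, num_heads=4)
print(attn(torch.randn(2, 8, 64)).shape)  # torch.Size([2, 8, 64])
```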
@@ -836,9 +836,9 @@ The following tables show the evaluation results of each model, with reasoning a
  <td >KMMLU-Redux</td>
  <td align="center">46.9</td>
  <td align="center">25.0</td>
- <td align="center">24.5</td>
- <td align="center">38.0</td>
- <td align="center">33.7</td>
+ <td align="center">19.4</td>
+ <td align="center">29.8</td>
+ <td align="center">26.4</td>
  </tr>
  <tr>
  <td >KSM</td>
@@ -1142,4 +1142,4 @@ The model is licensed under [EXAONE AI Model License Agreement 1.2 - NC](./LICEN

  ## Contact

- LG AI Research Technical Support: [email protected]
+ LG AI Research Technical Support: [email protected]
 