Commit 0806c70
Parent(s): 3cbee3d
Update README.md
README.md CHANGED

@@ -34,19 +34,19 @@ The EXAONE 4.0 model series consists of two sizes: a mid-size **32B** model opti
 In the EXAONE 4.0 architecture, we apply new architectural changes compared to previous EXAONE models as below:
 
 1. **Hybrid Attention**: For the 32B model, we adopt hybrid attention scheme, which combines *Local attention (sliding window attention)* with *Global attention (full attention)* in a 3:1 ratio. We do not use RoPE (Rotary Positional Embedding) for global attention for better global context understanding.
-2. **QK-Reorder-Norm**: We
+2. **QK-Reorder-Norm**: We reorder the LayerNorm position from the traditional Pre-LN scheme by applying LayerNorm directly to the attention and MLP outputs, and we add RMS normalization right after the Q and K projection. It helps yield better performance on downstream tasks despite consuming more computation.
 
 For more details, please refer to our [technical report](https://arxiv.org/abs/2507.11407), [blog](https://www.lgresearch.ai/blog/view?seq=576), and [GitHub](https://github.com/LG-AI-EXAONE/EXAONE-4.0).
 
 
 ### Model Configuration
 
-- Number of Parameters (without embeddings):
-- Number of Layers:
-- Number of Attention Heads:
+- Number of Parameters (without embeddings): 1.07B
+- Number of Layers: 30
+- Number of Attention Heads: GQA with 32-heads and 8-KV heads
 - Vocab Size: 102,400
-- Context Length:
-
+- Context Length: 65,536 tokens
+- Quantization: `Q8_0`, `Q6_K`, `Q5_K_M`, `Q4_K_M`, `IQ4_XS` in GGUF format (also includes `BF16` weights)
 
 ## Quickstart
 
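For readers skimming the diff, the QK-Reorder-Norm description added at line 37 can be sketched roughly as below. This is a minimal illustration, not the official EXAONE 4.0 code: the hidden size, feed-forward width, and the use of `torch.nn.RMSNorm` (PyTorch >= 2.4) for every normalization are assumptions, and the 32B model's hybrid local/global attention (3:1) is omitted; only the head counts (32 query heads, 8 KV heads) and the placement of the norms come from the model card text above.

```python
# Hedged sketch of a QK-Reorder-Norm decoder block; NOT the official EXAONE code.
# Assumptions: d_model / d_ff are placeholders, every norm is RMSNorm, and the
# feed-forward is a plain MLP (the exact FFN structure is not given in the card).
import torch
import torch.nn as nn
import torch.nn.functional as F


class QKReorderNormBlock(nn.Module):
    """Block where normalization wraps the sub-layer *outputs* (instead of the
    Pre-LN placement on their inputs), with an extra RMSNorm applied right
    after the Q and K projections."""

    def __init__(self, d_model=2048, n_heads=32, n_kv_heads=8, d_ff=8192):
        super().__init__()
        self.n_heads, self.n_kv_heads = n_heads, n_kv_heads
        self.head_dim = d_model // n_heads
        self.q_proj = nn.Linear(d_model, n_heads * self.head_dim, bias=False)
        self.k_proj = nn.Linear(d_model, n_kv_heads * self.head_dim, bias=False)
        self.v_proj = nn.Linear(d_model, n_kv_heads * self.head_dim, bias=False)
        self.o_proj = nn.Linear(n_heads * self.head_dim, d_model, bias=False)
        # Part 1 of QK-Reorder-Norm: RMSNorm right after the Q and K projections.
        self.q_norm = nn.RMSNorm(self.head_dim)
        self.k_norm = nn.RMSNorm(self.head_dim)
        # Part 2: norms applied directly to the attention and MLP outputs.
        self.post_attn_norm = nn.RMSNorm(d_model)
        self.post_mlp_norm = nn.RMSNorm(d_model)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, d_ff, bias=False),
            nn.SiLU(),
            nn.Linear(d_ff, d_model, bias=False),
        )

    def forward(self, x):  # x: (batch, seq_len, d_model)
        B, T, _ = x.shape
        q = self.q_proj(x).view(B, T, self.n_heads, self.head_dim)
        k = self.k_proj(x).view(B, T, self.n_kv_heads, self.head_dim)
        v = self.v_proj(x).view(B, T, self.n_kv_heads, self.head_dim)
        q, k = self.q_norm(q), self.k_norm(k)      # normalize Q/K per head
        q, k, v = (t.transpose(1, 2) for t in (q, k, v))
        # GQA: repeat the 8 KV heads so they serve all 32 query heads.
        rep = self.n_heads // self.n_kv_heads
        k, v = k.repeat_interleave(rep, dim=1), v.repeat_interleave(rep, dim=1)
        out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        out = self.o_proj(out.transpose(1, 2).reshape(B, T, -1))
        x = x + self.post_attn_norm(out)           # norm on the attention output
        x = x + self.post_mlp_norm(self.mlp(x))    # norm on the MLP output
        return x
```

Compared with Pre-LN (x = x + Attn(Norm(x))), the norms here sit on the sub-layer outputs and the extra Q/K norms act before attention scores are computed; per the README text, this costs some additional computation but yields better downstream performance.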