Update README.md
README.md
CHANGED
@@ -14,7 +14,7 @@ tags:
 
 # Nova-Nox-Neural-Network
 
-All images used
+All images used are created by Rikka Botan.
 
 Flash Technical Report (Japanese)
 
@@ -28,8 +28,6 @@ The architecture employs ASGG: Adaptive Swish-GELU Gating as the activation func
 
 Furthermore, it utilizes DyT for normalization, which improves computational efficiency.
 
-This repository provides a simplified implementation of N4, and it can be readily integrated with recent Attention-based architectures such as SWA, GQA, and MLA.
-
 ***
 
 ### Key Features