Create README.md
README.md
ADDED
@@ -0,0 +1,19 @@
+---
+license: openrail
+datasets:
+- JeanKaddour/minipile
+- Open-Orca/OpenOrca
+language:
+- en
+---
+Micro Mistral
+This is a small Mistral model with 6 layers.
+
+It is similar to the smol llama variants in that it uses GQA and tied embeddings, but it follows the Mistral-style architecture, which adds sliding window attention.
+
+This architecture combines GQA and tied embeddings to create an efficient 0.5B model on the Mistral architecture, which is supported in downstream applications.
+
+Dataset
+Minipile Instruct Math OpenOrca Synthetic Data
+
+TODO: Complete Dataset section
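The README credits GQA and tied embeddings for keeping the model small. The sketch below is not taken from this repository; it is a rough, stdlib-only estimate of how those two choices affect the weight count of a bias-free, Mistral-style decoder. Only `num_layers=6` comes from the README. The class name `TinyMistralShape`, the function `count_parameters`, and every other dimension are hypothetical placeholders, so the totals it prints should not be read as the real model's size.

```python
from dataclasses import dataclass, replace


@dataclass
class TinyMistralShape:
    # Only num_layers=6 is stated in the README. Every other value is a
    # hypothetical placeholder chosen for illustration.
    num_layers: int = 6
    hidden_size: int = 2048
    num_heads: int = 16
    num_kv_heads: int = 4        # fewer KV heads than query heads = GQA
    head_dim: int = 128
    intermediate_size: int = 8192
    vocab_size: int = 32000
    tie_embeddings: bool = True


def count_parameters(shape: TinyMistralShape) -> int:
    """Count weights in a bias-free, Mistral-style decoder stack."""
    q_out = shape.num_heads * shape.head_dim
    kv_out = shape.num_kv_heads * shape.head_dim
    attention = (
        shape.hidden_size * q_out           # q_proj
        + 2 * shape.hidden_size * kv_out    # k_proj and v_proj (shared per query group)
        + q_out * shape.hidden_size         # o_proj
    )
    mlp = 3 * shape.hidden_size * shape.intermediate_size  # gate, up, down projections
    norms = 2 * shape.hidden_size           # two RMSNorm weight vectors per layer
    per_layer = attention + mlp + norms

    embeddings = shape.vocab_size * shape.hidden_size
    # With tied embeddings the output head reuses the input embedding matrix.
    lm_head = 0 if shape.tie_embeddings else embeddings
    final_norm = shape.hidden_size
    return shape.num_layers * per_layer + embeddings + lm_head + final_norm


if __name__ == "__main__":
    base = TinyMistralShape()
    untied = replace(base, tie_embeddings=False)
    full_mha = replace(base, num_kv_heads=base.num_heads)
    print("GQA + tied embeddings:", count_parameters(base))
    print("GQA + untied head:    ", count_parameters(untied))
    print("MHA + tied embeddings:", count_parameters(full_mha))
```

Under these placeholder dimensions, untying the head adds exactly one `vocab_size * hidden_size` matrix, and switching from GQA to full multi-head attention enlarges only the `k_proj` and `v_proj` weights in each layer. Sliding window attention changes the attention mask rather than the weights, so it does not appear in the count.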