---
library_name: adapter-transformers
license: apache-2.0
---
**Model Architecture** Llama 3 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety.
| | Training Data | Params | Context length | GQA | Token count | Knowledge cutoff |
|---|---|---|---|---|---|---|
| Llama 3 | A new mix of publicly available online data. | 8B | 8k | Yes | 15T+ | March, 2023 |
| | | 70B | 8k | Yes | | December, 2023 |
Llama 3 family of models. Token counts refer to pretraining data only. Both the 8B and 70B versions use Grouped-Query Attention (GQA) for improved inference scalability.
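GQA improves inference scalability by letting several query heads share one key/value head, which shrinks the KV cache by the grouping factor. A minimal NumPy sketch, assuming contiguous head grouping; the head counts mirror Llama 3 8B (32 query heads, 8 KV heads) but the tensor sizes are illustrative, not the real model dimensions:

```python
import numpy as np

def grouped_query_attention(q, k, v):
    """GQA sketch. q: (n_q, T, d); k, v: (n_kv, T, d), n_q % n_kv == 0.

    Each group of n_q // n_kv consecutive query heads shares one K/V head,
    so only n_kv K/V heads need to be cached instead of n_q.
    """
    n_q, T, d = q.shape
    n_kv = k.shape[0]
    group = n_q // n_kv
    # Expand shared K/V heads so each query head sees its group's K/V.
    k = np.repeat(k, group, axis=0)                   # (n_q, T, d)
    v = np.repeat(v, group, axis=0)                   # (n_q, T, d)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)    # (n_q, T, T)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # softmax over keys
    return weights @ v                                # (n_q, T, d)

rng = np.random.default_rng(0)
n_q, n_kv, T, d = 32, 8, 4, 16  # Llama 3 8B head counts; toy T and d
out = grouped_query_attention(
    rng.standard_normal((n_q, T, d)),
    rng.standard_normal((n_kv, T, d)),
    rng.standard_normal((n_kv, T, d)),
)
print(out.shape)  # (32, 4, 16)
```

With 32 query heads and 8 KV heads, the KV cache is 4x smaller than full multi-head attention while the attention output shape is unchanged.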
sft 1700 llama3 test, 25 EPOCH