New base model, this one actually expects you to use Llama 3 Instruct format.
This is evolution 1. Yes, I know it makes no sense. I explain this in my rant down below. I'm going to list the recipe for the model now, but know the reality is more complex than just this:
# Model Architecture

This is a [Model Stock](https://arxiv.org/pdf/2403.19522) and [TIES](https://arxiv.org/abs/2306.01708) merge. Thanks to [mergekit](https://github.com/arcee-ai/mergekit) for making this really easy to do.

Stock for the "True Merge" -- this was a TIES merge; the reasoning for using TIES over Model Stock this time is explained below, though Model Stock was also used:
- PKU-Baichuan-MLSystemLab/Llama3-PBM-Nova-70B
- yentinglin/Llama-3-Taiwan-70B-Instruct
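
As a rough sketch, a TIES merge over that stock looks something like the following mergekit config. This is not the actual recipe for this model -- the base model, weights, and densities below are placeholder assumptions for illustration only:

```yaml
# Hypothetical mergekit TIES config -- values are illustrative, not the real recipe.
merge_method: ties
# TIES needs a base model to diff task vectors against; this choice is an assumption.
base_model: meta-llama/Meta-Llama-3-70B-Instruct
models:
  - model: PKU-Baichuan-MLSystemLab/Llama3-PBM-Nova-70B
    parameters:
      weight: 0.5   # placeholder contribution weight
      density: 0.5  # placeholder fraction of parameters kept after trimming
  - model: yentinglin/Llama-3-Taiwan-70B-Instruct
    parameters:
      weight: 0.5
      density: 0.5
dtype: bfloat16
```

Saved as `config.yml`, a config like this runs with `mergekit-yaml config.yml ./merged-model`.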