leftfooted/DUSFT-llm-model

leftfooted committed on Jan 10

Commit 8854a09 · verified · 1 Parent(s): a7b76ca

Create README.md

Files changed (1)

README.md +14 -0

	@@ -0,0 +1,14 @@

+# DUS Forty Layer Merged Model
+## Overview
+The DUS Forty Layer Merged Model is built with a layer-interlocking strategy that combines layers from the Llama-2-13B and Mistral-7B architectures. The approach aims to reduce computational cost while keeping performance competitive across natural language processing tasks.
+## Model Details
+- **Architecture**: Based on Llama-2-13B and Mistral-7B
+- **Layer Arrangement**: The `forty` configuration merges layers from both models, interlocking layers 0–20 with layers 12–32.
+- **Tokenizer**: The Mistral-7B tokenizer is used for encoding and decoding.
+## Training Details
+- **Base Models**:
+  - Llama-2-13B: [meta-llama/Llama-2-13b-hf](https://huggingface.co/meta-llama/Llama-2-13b-hf)
+  - Mistral-7B: [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1)
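
As a rough illustration of the layer arrangement described in the README, the sketch below builds a 40-entry layer plan from the two stated spans. It assumes the spans are half-open (0–20 meaning layers 0 through 19, and 12–32 meaning layers 12 through 31), which is one reading that yields forty layers; the README does not confirm this. The labels `A` and `B`, the function `build_layer_plan`, and the `interleave` flag are hypothetical: the README does not say which base model supplies which span, or whether "interlocking" means stacking the spans or alternating between them.

```python
from typing import List, Tuple

# (source label, layer index within that source); labels are placeholders.
LayerRef = Tuple[str, int]


def build_layer_plan(first: range, second: range,
                     interleave: bool = False) -> List[LayerRef]:
    """Return the ordered source layers for a merged stack.

    With interleave=False the two spans are stacked one after the other.
    With interleave=True the layers alternate A, B, A, B, ...
    """
    a = [("A", i) for i in first]
    b = [("B", i) for i in second]
    if not interleave:
        return a + b

    plan: List[LayerRef] = []
    for x, y in zip(a, b):
        plan.extend([x, y])
    # If one span is longer, append its remaining layers at the end.
    shorter = min(len(a), len(b))
    plan.extend(a[shorter:] + b[shorter:])
    return plan


# Assumed half-open reading of "layers 0–20" and "layers 12–32".
plan = build_layer_plan(range(0, 20), range(12, 32))
print(len(plan))   # 40
print(plan[19])    # ('A', 19)
print(plan[20])    # ('B', 12)
```

This only models the bookkeeping of which layers end up where. It does not load weights or address how layers from models with different hidden sizes would be made compatible, which the README does not describe.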