---
license: mit
---
# DUS Forty Layer Merged Model

## Overview
The DUS (depth up-scaling) Forty Layer Merged Model uses a layer-interlocking strategy that combines transformer layers from the Llama-2-13B and Mistral-7B architectures. By stacking reused pretrained layers instead of training a deeper model from scratch, the approach keeps computational cost low while aiming to stay competitive across common natural language processing tasks.

## Model Details
- **Architecture**: Based on Llama-2-13B and Mistral-7B
- **Layer Arrangement**: The `forty` configuration merges layers from both models, interlocking layers 0–20 with layers 12–32 (half-open ranges, so 20 + 20 = 40 transformer blocks in total); see the sketch after this list.
- **Tokenizer**: The Mistral-7B tokenizer is used for encoding and decoding.
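
The snippet below is a minimal sketch of that layer plan. It assumes the ranges above are half-open (exclusive upper bounds), which is what yields exactly forty layers, and uses `donor_a`/`donor_b` as placeholder labels for the two base models rather than actual repo ids:

```python
# Hypothetical sketch of the `forty` interlocking plan. The half-open
# interpretation of the 0-20 and 12-32 ranges is an assumption; it is
# the reading that produces exactly forty transformer blocks.
front = [("donor_a", i) for i in range(0, 20)]   # layers 0-19 of the first model
back = [("donor_b", i) for i in range(12, 32)]   # layers 12-31 of the second model

layer_plan = front + back
assert len(layer_plan) == 40  # the "forty" in the model name
```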

## Training Details
- **Base Models**: 
  - Llama-2-13B: [meta-llama/Llama-2-13b-hf](https://huggingface.co/meta-llama/Llama-2-13b-hf)
  - Mistral-7B: [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1)
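
## Usage
A minimal usage sketch, assuming the merged weights are published on the Hugging Face Hub; the repo id `your-org/dus-forty-layer-merged` below is a placeholder, not the actual location. Per the details above, encoding and decoding go through the Mistral-7B tokenizer:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# The Mistral-7B tokenizer handles encoding and decoding, as noted above.
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")

# Placeholder repo id -- substitute the actual location of the merged weights.
model = AutoModelForCausalLM.from_pretrained("your-org/dus-forty-layer-merged")

inputs = tokenizer("Depth up-scaling works by", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```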