---
license: mit
---
# DUS Forty Layer Merged Model

## Overview
The DUS Forty Layer Merged Model uses a layer-interlocking strategy, combining layers from the Llama-2-13B and Mistral-7B architectures. The approach aims to balance computational efficiency against competitive performance across a range of natural language processing tasks.
## Model Details
- Architecture: Based on Llama-2-13B and Mistral-7B
- Layer Arrangement: The `forty` configuration merges layers from both models, interlocking layers 0–20 with layers 12–32 (see the sketch below).
- Tokenizer: The Mistral-7B tokenizer is used for encoding and decoding.
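
To make the interlocking concrete, here is a minimal sketch of how the `forty` layer map could be expressed. The `build_layer_map` helper and the half-open index convention (so each range contributes 20 layers, 40 in total) are illustrative assumptions, not the model's actual merge code.

```python
# Hypothetical sketch of the `forty` interlocking arrangement as a
# layer-index map. Ranges are treated as half-open, so 0-20 and 12-32
# each contribute 20 decoder layers.

def build_layer_map(first_range=(0, 20), second_range=(12, 32)):
    """Return (source_model, layer_index) pairs for the merged stack."""
    layer_map = []
    # Layers 0-20, taken from the first base model (e.g. Llama-2-13B).
    for i in range(*first_range):
        layer_map.append(("llama-2-13b", i))
    # Layers 12-32, taken from the second base model (e.g. Mistral-7B).
    for i in range(*second_range):
        layer_map.append(("mistral-7b", i))
    return layer_map

if __name__ == "__main__":
    merged = build_layer_map()
    print(f"{len(merged)} layers total")  # 40 layers -> "forty"
```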
## Training Details
- Base Models:
  - Llama-2-13B: `meta-llama/Llama-2-13b-hf`
  - Mistral-7B: `mistralai/Mistral-7B-v0.1`
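
For reference, the base checkpoints and the tokenizer named above can be loaded with Hugging Face `transformers`. This is a minimal loading sketch only, not the merge procedure itself; note that `meta-llama/Llama-2-13b-hf` is a gated repository that requires access approval.

```python
# Sketch: load the two base checkpoints and the Mistral-7B tokenizer.

from transformers import AutoModelForCausalLM, AutoTokenizer

# The model card states the Mistral-7B tokenizer is used.
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")

# Base models listed under Training Details.
llama = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-13b-hf")
mistral = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")

# Both architectures expose their decoder stacks as model.model.layers,
# which is where a depth up-scaling merge would splice the two stacks.
print(len(llama.model.layers), len(mistral.model.layers))  # 40, 32
```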