# DUS Forty Layer Merged Model

## Overview

The DUS Forty Layer Merged Model uses a layer-interlocking (depth up-scaling, DUS) merge strategy, combining decoder layers from the Llama-2-13B and Mistral-7B architectures into a single forty-layer model. The aim is to improve computational efficiency while keeping performance competitive across common natural language processing tasks.

## Model Details

- Architecture: Based on Llama-2-13B and Mistral-7B
- Layer Arrangement: The forty configuration merges layers from both models, interlocking layers 0–20 with layers 12–32 (see the sketch after this list).
- Tokenizer: The Mistral-7B tokenizer is used for encoding and decoding.
- Model size: 8.99B parameters, stored as F16 safetensors.
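
The layer arithmetic implied by the name and the reported size can be sanity-checked with a short sketch. The illustration below rests on two assumptions not stated in the card: the ranges "0–20" and "12–32" are half-open slices (20 layers each, giving the forty layers of the name), and every merged layer has Mistral-7B dimensions (hidden size 4096, MLP size 14336, grouped-query attention with 8 of 32 heads as KV heads). The `donor_a`/`donor_b` labels and the `interlocked_layer_ids` helper are placeholders for illustration, not the authors' merge script.

```python
# Hypothetical sketch of the "forty" interlocking scheme described above.
# Ranges are assumed half-open, since 20 + 20 layers matches the model name.
FIRST_SLICE = slice(0, 20)    # "layers 0-20" from the first donor
SECOND_SLICE = slice(12, 32)  # "layers 12-32" from the second donor

def interlocked_layer_ids() -> list[tuple[str, int]]:
    """Return (donor, layer_index) pairs for the merged layer stack."""
    first = [("donor_a", i) for i in range(*FIRST_SLICE.indices(32))]
    second = [("donor_b", i) for i in range(*SECOND_SLICE.indices(32))]
    return first + second

stack = interlocked_layer_ids()
assert len(stack) == 40  # the "forty" in the model name

# Back-of-the-envelope parameter count, assuming every merged layer has
# Mistral-7B dimensions (hidden 4096, MLP 14336, 8 KV heads out of 32):
hidden, inter, vocab = 4096, 14336, 32000
attn = 2 * hidden * hidden + 2 * hidden * hidden // 4  # q/o plus smaller k/v projections
mlp = 3 * hidden * inter                                # gate, up, down projections
total = len(stack) * (attn + mlp) + 2 * vocab * hidden  # plus untied input/output embeddings
print(f"{total / 1e9:.2f}B parameters")                 # -> 8.99B
```

Under these assumptions the count lands on 8.99B, which matches the reported model size; treat this as a consistency check on the figure, not as documentation of the actual merge.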

## Training Details
