DiffSkip-Llama-3-8B-Instruct
This repository contains the implementation of the paper "Differential Layer Skipping in Large Language Models".
Model Description
DiffSkip-Llama-3-8B-Instruct is an enhanced version of Llama-3-8B-Instruct that incorporates the Differential Layer Skipping (DiffSkip) method to enable dynamic Feed-Forward Network (FFN) skipping during text generation. The method uses the self-attention input-output difference as a routing signal, letting individual tokens bypass FFN blocks according to their computational needs; a minimal sketch of this routing logic appears after the details below.
- Developed by: Xuan Luo, Weizhi Wang, Xifeng Yan
- Model type: Causal Language Model with dynamic FFN skipping
- Language(s) (NLP): English (en)
- License: Apache-2.0
- Finetuned from model: meta-llama/Meta-Llama-3-8B-Instruct
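
The sketch below illustrates the routing idea under stated assumptions: a lightweight linear router scores each token from the attention sub-layer's input-output difference, and tokens scoring below a threshold skip the FFN. All names here (`DiffSkipBlock`, `SelfAttn`, `router`, `threshold`) and the sigmoid-plus-threshold gate are hypothetical illustrations, not the authors' code; the paper's actual routing function may differ.

```python
import torch
import torch.nn as nn

class SelfAttn(nn.Module):
    """Plain self-attention wrapper, included only to make the sketch runnable."""
    def __init__(self, d_model: int, n_heads: int = 4):
        super().__init__()
        self.mha = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x):
        out, _ = self.mha(x, x, x, need_weights=False)
        return out

class DiffSkipBlock(nn.Module):
    """One transformer block with DiffSkip-style per-token FFN routing.

    Hypothetical sketch: a router reads the self-attention input-output
    difference and decides, per token, whether to run the FFN sub-layer.
    """
    def __init__(self, d_model: int, threshold: float = 0.5):
        super().__init__()
        self.attn = SelfAttn(d_model)
        self.ffn = nn.Sequential(
            nn.Linear(d_model, 4 * d_model), nn.GELU(), nn.Linear(4 * d_model, d_model)
        )
        self.router = nn.Linear(d_model, 1)  # assumed: a lightweight linear router
        self.threshold = threshold           # assumed: fixed skip threshold

    def forward(self, x):
        attn_out = self.attn(x)   # self-attention sub-layer output
        h = x + attn_out          # residual connection
        # Routing signal: the attention sub-layer's input-output difference,
        # which with a residual formulation is attn_out itself.
        gate = torch.sigmoid(self.router(attn_out))    # (batch, seq, 1)
        keep = (gate > self.threshold).to(h.dtype)     # 1 = run FFN, 0 = skip
        return h + keep * self.ffn(h)                  # skipped tokens bypass the FFN

# Smoke test on random inputs
block = DiffSkipBlock(d_model=64)
y = block(torch.randn(2, 10, 64))  # (batch=2, seq=10, d_model=64)
```

Tokens whose routing score falls below the threshold contribute no FFN computation at all, which is where the inference savings would come from.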
Model Card Contact
For questions or inquiries, please contact [email protected].
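Example Usage
The model should load as a standard causal LM with Hugging Face Transformers. This is a hedged sketch: `trust_remote_code=True` is an assumption, needed only if the repository ships custom modeling code for the skipping logic.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "xuan-luo/DiffSkip-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# trust_remote_code is an assumption: any custom DiffSkip modeling code
# bundled with the checkpoint would require it.
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

prompt = "Explain differential layer skipping in one sentence."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```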