Mamba-In-Zephyr
Collection
Mamba distilled from Zephyr. The Mamba in the Llama: Distilling and Accelerating Hybrid Models (https://arxiv.org/abs/2408.15237).
•
6 items
•
Updated
This model is a fine-tuned version of JunxiongWang/mamba_0_875_sft on the HuggingFaceH4/ultrafeedback_binarized dataset.
More information needed
More information needed
More information needed
The following hyperparameters were used during training:
@article{junxiongdaniele2024mambainllama,
title = {The Mamba in the Llama: Distilling and Accelerating Hybrid Models},
author = {Junxiong Wang and Daniele Paliotta and Avner May and Alexander M. Rush and Tri Dao},
journal = {arXiv preprint arXiv:2408.15237},
year = {2024}
}
Base model
JunxiongWang/mamba_0_875_sft