MatthieuZ committed on
Commit 50ec842 · verified · 1 Parent(s): 8a8e888

Update README.md

Files changed (1): README.md (+37 −3)
README.md CHANGED
# Mixture of Attentions for Speculative Decoding

These are the checkpoints obtained from "[Mixture of Attentions For Speculative Decoding](https://arxiv.org/abs/2410.03804)" by Matthieu Zimmer*, Milan Gritta*, Gerasimos Lampouras, Haitham Bou Ammar, and Jun Wang.
The paper introduces a novel architecture for speculative decoding that speeds up large language model (LLM) inference.

The checkpoints are supported in vLLM; see our [GitHub repository](https://github.com/huawei-noah/HEBO/tree/mixture-of-attentions/) for details.
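As a rough sketch of how a drafter checkpoint like this one is typically plugged into vLLM's speculative decoding, the snippet below uses vLLM's generic offline API. The argument names (`speculative_model`, `num_speculative_tokens`) are assumptions here; the fork linked above may expose a different interface, so check its examples for the exact setup.

```python
# Minimal sketch (assumed interface): serve Llama-3-8B-Instruct with the
# MOA Spec drafter via vLLM's generic speculative-decoding arguments.
# The mixture-of-attentions fork may use different option names.
from vllm import LLM, SamplingParams

llm = LLM(
    model="meta-llama/Meta-Llama-3-8B-Instruct",                  # base (target) model
    speculative_model="huawei-noah/MOASpec-Llama-3-8B-Instruct",  # small drafter
    num_speculative_tokens=5,  # assumed draft length; tune for your workload
)

outputs = llm.generate(
    ["Explain speculative decoding in one paragraph."],
    SamplingParams(temperature=0.0, max_tokens=128),
)
print(outputs[0].outputs[0].text)
```

In this setup the small drafter proposes several tokens per step and the base model verifies them in a single forward pass, which is where the inference speed-up comes from.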
### Checkpoints

| Base Model | MOA Spec on Hugging Face | Base Model Parameters | MOA Spec Parameters |
|------|------|------|------|
| meta-llama/Meta-Llama-3-8B-Instruct | [huawei-noah/MOASpec-Llama-3-8B-Instruct](https://huggingface.co/huawei-noah/MOASpec-Llama-3-8B-Instruct) | 8B | 0.25B |
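If you only need the drafter weights locally (for example, to point a custom runner at them), a standard Hub download works; the snippet below is an illustrative convenience, not part of the official setup.

```python
# Illustrative: download the MOA Spec drafter checkpoint from the Hugging Face Hub.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(repo_id="huawei-noah/MOASpec-Llama-3-8B-Instruct")
print(f"Checkpoint files downloaded to: {local_dir}")
```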
## Citation

If you use this code or this checkpoint in your research, please cite our paper:

```bibtex
@misc{zimmer2024mixtureattentionsspeculativedecoding,
      title={Mixture of Attentions For Speculative Decoding},
      author={Matthieu Zimmer and Milan Gritta and Gerasimos Lampouras and Haitham Bou Ammar and Jun Wang},
      year={2024},
      eprint={2410.03804},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2410.03804},
}
```
## License

This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for more details.

Disclaimer: This open-source project is not an official Huawei product, and Huawei is not expected to provide support for it.