Vinnnf committed
Commit 1e4ced4 · verified · 1 Parent(s): 5d28030

Update README.md

Files changed (1)
  1. README.md +29 -0
README.md CHANGED
@@ -4,3 +4,32 @@ base_model:
  library_name: transformers
  ---
+ # MaskLLM: Learnable Semi-structured Sparsity for Large Language Models
+
+ <div align="center">
+ <figure>
+ <img src="https://github.com/NVlabs/MaskLLM/blob/main/assets/teaser.png?raw=true" style="width:70%; display:block; margin-left:auto; margin-right:auto;">
+ </figure>
+ </div>
+
+ This work introduces [MaskLLM](https://github.com/NVlabs/MaskLLM), a **learnable** pruning method that establishes **Semi-structured (or "N:M") Sparsity** in LLMs, aimed at reducing computational overhead during inference. The method is scalable and stands to benefit from larger training datasets.
+
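+ As a quick illustration of the 2:4 ("N:M") pattern, the sketch below builds a simple magnitude-based mask for a weight tensor: within every contiguous group of four weights, the two largest-magnitude entries are kept and the other two are zeroed. This is only meant to show what the sparsity pattern looks like; it is a plain magnitude heuristic, not the learned masks produced by MaskLLM.
+
+ ```python
+ import torch
+
+ def magnitude_2_to_4_mask(weight: torch.Tensor) -> torch.Tensor:
+     """Binary 2:4 mask: keep the 2 largest-magnitude entries in each group of 4."""
+     out_features, in_features = weight.shape
+     assert in_features % 4 == 0, "in_features must be divisible by 4"
+     groups = weight.abs().reshape(out_features, in_features // 4, 4)
+     top2 = groups.topk(k=2, dim=-1).indices  # positions of the 2 largest magnitudes per group
+     mask = torch.zeros_like(groups)
+     mask.scatter_(-1, top2, 1.0)             # mark the kept positions
+     return mask.reshape(out_features, in_features)
+
+ w = torch.randn(8, 16)
+ w_sparse = w * magnitude_2_to_4_mask(w)      # each group of 4 now has exactly 2 zeros
+ ```
+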
+ ## Requirements
+ We provide pre-computed masks for Hugging Face models such as Llama-2 7B and Llama-3 8B. Using them requires only a few packages; no Docker, Megatron, or data preprocessing is involved.
+ ```bash
+ pip install transformers accelerate datasets SentencePiece
+ ```
+
+ ## Pre-computed Masks
+
+ The following masks were trained and provided by [@VainF](https://github.com/VainF). We use `huggingface_hub` to download these masks automatically and apply them to the official LLMs for evaluation; a minimal loading sketch is shown after the table. The mask files were compressed with [numpy.savez_compressed](tool_compress_mask.py). More results for baselines (SparseGPT, Wanda) can be found in the appendix of the paper.
+ | Model | Pattern | Training Data | Training/Eval SeqLen | PPL (Dense) | PPL (Sparse) | Link |
+ | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
+ | LLaMA-2 7B | 2:4 | C4 (2B Tokens) | 4096 | 5.12 | 6.78 | [HuggingFace](https://huggingface.co/Vinnnf/LLaMA-2-7B-MaskLLM-C4) |
+ | LLaMA-3 8B | 2:4 | C4 (2B Tokens) | 4096 | 5.75 | 8.49 | [HuggingFace]() |
+ | LLaMA-3.1 8B | 2:4 | C4 (2B Tokens) | 4096 | - | - | Coming Soon |
+
+
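+ Below is a minimal sketch of how a pre-computed mask might be fetched with `huggingface_hub` and applied to the corresponding Hugging Face model. The artifact filename (`mask_compressed.npz`) and the assumption that the `.npz` keys match the model's parameter names are illustrative guesses; refer to [NVlabs/MaskLLM](https://github.com/NVlabs/MaskLLM) for the authoritative loading and evaluation code.
+
+ ```python
+ import numpy as np
+ import torch
+ from huggingface_hub import hf_hub_download
+ from transformers import AutoModelForCausalLM
+
+ # Filename is an assumption; check the mask repository for the actual artifact name.
+ mask_path = hf_hub_download(
+     repo_id="Vinnnf/LLaMA-2-7B-MaskLLM-C4",
+     filename="mask_compressed.npz",
+ )
+ masks = np.load(mask_path)  # NpzFile produced by numpy.savez_compressed
+
+ model = AutoModelForCausalLM.from_pretrained(
+     "meta-llama/Llama-2-7b-hf", torch_dtype=torch.float16
+ )
+
+ # Zero out weights wherever the 2:4 mask is zero
+ # (assumes each .npz key matches a parameter name in the model).
+ with torch.no_grad():
+     for name, param in model.named_parameters():
+         if name in masks.files:
+             mask = torch.from_numpy(masks[name]).to(device=param.device, dtype=param.dtype)
+             param.mul_(mask)
+ ```
+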
+ ## How to use it
+
+ Please see [NVlabs/MaskLLM](https://github.com/NVlabs/MaskLLM).