---
license: llama2
datasets:
- umd-zhou-lab/recycled_alpaca_v1
language:
- en
---
# Model Card for umd-zhou-lab/recycled-alpaca-7b-v1.0

<!-- Provide a quick summary of what the model is/does. -->

This model was trained by fine-tuning Llama-2 on the Recycled Alpaca data (V1).

## Model Details

### Model Description

<!-- Provide a longer summary of what this model is. -->

- **Developed by:** UMD Tianyi Zhou Lab
- **Model type:** An auto-regressive language model based on the transformer architecture
- **License:** Llama 2 Community License Agreement
- **Finetuned from model:** [meta-llama/Llama-2-7b](https://huggingface.co/meta-llama/Llama-2-7b)

### Model Sources

<!-- Provide the basic links for the model. -->

- **GitHub:** [Reflection-Tuning](https://github.com/tianyi-lab/Reflection_Tuning)
- **Paper:** [Reflection-Tuning: Data Recycling Improves LLM Instruction-Tuning](https://arxiv.org/abs/2310.11716)
- **Data:** [recycled_alpaca_v1](https://huggingface.co/datasets/umd-zhou-lab/recycled_alpaca_v1)

## Uses

The primary use of this model is research on large language models and chatbots.
The primary intended users of the model are researchers and hobbyists in natural language processing, machine learning, and artificial intelligence.
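
The model can be loaded with the standard Hugging Face `transformers` API. The snippet below is a minimal sketch, not an official example from the authors; it assumes a recent `transformers` release and loads the checkpoint in half precision.

```python
# Minimal loading sketch (assumption: standard transformers AutoClass API;
# not an official example from the model authors).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "umd-zhou-lab/recycled-alpaca-7b-v1.0"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,  # half precision so the 7B model fits on a single GPU
    device_map="auto",
)
```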

## Training

We use the prompt format from [FastChat](https://github.com/lm-sys/FastChat):

```
A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. USER: Hi ASSISTANT: Hello.</s>USER: Who are you? ASSISTANT: I am ...</s>......
```
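
For reference, here is a minimal sketch of assembling a single-turn prompt in this format and generating a reply with the model and tokenizer loaded above. The `format_prompt` helper is illustrative only, not part of FastChat or the original training code.

```python
# Illustrative single-turn prompt builder for the Vicuna-style template shown above.
# Assumption: one user turn; the model emits "</s>" when it finishes its answer.
SYSTEM = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions."
)

def format_prompt(user_message: str) -> str:
    return f"{SYSTEM} USER: {user_message} ASSISTANT:"

inputs = tokenizer(
    format_prompt("Give me a three-day itinerary for Rome."),
    return_tensors="pt",
).to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens (skip the prompt).
print(tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```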

| Model | Global Batch Size | Learning Rate | Epochs | Max Length | Weight Decay | Warmup Ratio |
| --- | ---: | ---: | ---: | ---: | ---: | ---: |
| Recycled Models (7B) | 128 | 2e-5 | 3 | 2048 | 0 | 0.03 |
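
These hyperparameters map directly onto Hugging Face `TrainingArguments`; the sketch below shows that mapping. It is an illustration under stated assumptions (8 GPUs with per-device batch 4 and gradient accumulation 4 to reach the global batch size of 128), not the authors' exact training script.

```python
# Sketch: the table above expressed as transformers.TrainingArguments.
# Assumptions (not from the model card): 8 GPUs, per-device batch 4,
# gradient accumulation 4 (8 x 4 x 4 = 128 global batch), bf16 mixed precision.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="recycled-alpaca-7b-v1.0",
    num_train_epochs=3,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    learning_rate=2e-5,
    weight_decay=0.0,
    warmup_ratio=0.03,
    bf16=True,
)
# The max length of 2048 is applied when tokenizing the data
# (e.g. tokenizer.model_max_length), not through TrainingArguments.
```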

## Performance

The following table compares our recycled models (V1) with baseline models on the AlpacaEval Leaderboard and the Hugging Face Open LLM Leaderboard.

The Recycled Alpaca data can be found here: [[hf-Link]](https://huggingface.co/datasets/umd-zhou-lab/recycled_alpaca_v1) <br>
The Recycled WizardLM (70k) data can be found here: [[hf-Link]](https://huggingface.co/datasets/umd-zhou-lab/recycled_wiz70_v1) <br>

| Model | AlpacaEval | Avg | ARC | HellaSwag | MMLU | TruthfulQA | Checkpoint |
|---|---:|---:|---:|---:|---:|---:|:-:|
| **Alpaca 7B** | 26.46 | 50.21 | 42.65 | 76.91 | 41.73 | 39.55 | / |
| **Recycled Alpaca 7B V1.0** | 76.99 | 56.18 | 53.92 | 77.68 | 47.55 | 45.55 | [[hf-Link]](https://huggingface.co/umd-zhou-lab/recycled-alpaca-7b-v1.0) |
| **Recycled Alpaca 13B V1.0** | 83.42 | 58.93 | 58.70 | 80.80 | 53.11 | 43.12 | [Link] |
| **WizardLM 7B** | 67.64 | 54.18 | 51.60 | 77.70 | 42.70 | 44.70 | / |
| **Recycled WizardLM 7B V1.0** | 78.88 | 56.21 | 53.92 | 77.05 | 48.35 | 45.52 | [[hf-Link]](https://huggingface.co/umd-zhou-lab/recycled-wizardlm-7b-v1.0) |

## Citation

Please consider citing our paper if you find our code, data, or models useful. Thank you!

```
@misc{li2023reflectiontuning,
      title={Reflection-Tuning: Data Recycling Improves LLM Instruction-Tuning},
      author={Ming Li and Lichang Chen and Jiuhai Chen and Shwai He and Heng Huang and Jiuxiang Gu and Tianyi Zhou},
      year={2023},
      eprint={2310.11716},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```