shenzhi-wang committed
Commit 87ec814 · verified · 1 Parent(s): 3f7c064

Update README.md

Files changed (1)
  1. README.md +19 -2
README.md CHANGED
@@ -14,8 +14,6 @@ tags:
  ---
  
  
- This model is developed by [Shenzhi Wang](https://shenzhi-wang.netlify.app) (王慎执) and [Yaowei Zheng](https://github.com/hiyouga) (郑耀威).
-
  🌟 We included all instructions on how to download, use, and reproduce our various kinds of models at [this GitHub repo](https://github.com/Shenzhi-Wang/Llama3-Chinese-Chat). If you like our models, we would greatly appreciate it if you could star our Github repository. Additionally, please click "like" on our HuggingFace repositories. Thank you!
  
  
@@ -25,6 +23,19 @@ This model is developed by [Shenzhi Wang](https://shenzhi-wang.netlify.app) (王
  - 🚀🚀🚀 [Apr. 29, 2024] We now introduce Llama3-8B-Chinese-Chat-**v2**! Compared to v1, the training dataset of v2 is **5 times larger** (~100K preference pairs), and it exhibits significant enhancements, especially in **roleplay**, **function calling**, and **math** capabilities! The training dataset of Llama3-8B-Chinese-Chat-v2 will be released soon. If you love our Llama3-8B-Chinese-Chat-v1, you won't want to miss out on Llama3-8B-Chinese-Chat-v2!
  
  
+
+ # Model Summary
+
+ Llama3-8B-Chinese-Chat is an instruction-tuned language model for Chinese & English users with various abilities such as roleplaying & tool-using built upon the Meta-Llama-3-8B-Instruct model.
+
+ Developed by: [Shenzhi Wang](https://shenzhi-wang.netlify.app) (王慎执) and [Yaowei Zheng](https://github.com/hiyouga) (郑耀威)
+
+ - License: [Llama-3 License](https://llama.meta.com/llama3/license/)
+ - Base Model: Meta-Llama-3-8B-Instruct
+ - Model Size: 8.03B
+ - Context length: 8K
+
+
  # 1. Introduction
  
  ❗️❗️❗️NOTICE: The main branch contains the 8bit-quantized GGUF version of Llama3-8B-Chinese-Chat-**v2**, if you want to use our 8bit-quantized GGUF version of Llama3-8B-Chinese-Chat-**v1**, please refer to [the `v1` branch](https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat-GGUF-8bit/tree/v1).
@@ -56,6 +67,9 @@ Training details:
  - optimizer: paged_adamw_32bit
  
  
+ <details>
+ <summary>To reproduce the model</summary>
+
  To reproduce Llama3-8B-Chinese-Chat-**v2** (to reproduce Llama3-8B-Chinese-Chat-**v1**, please refer to [this link](https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat/blob/v1/README.md#1-introduction)):
  
  ```bash
@@ -97,6 +111,9 @@ deepspeed --num_gpus 8 src/train_bash.py \
  --optim paged_adamw_32bit
  ```
  
+ </details>
+
+
  # 2. Usage
  
  ```python