Update README.md
README.md
CHANGED
@@ -14,8 +14,6 @@ tags:
|
|
14 |
---
|
15 |
|
16 |
|
17 |
-
This model is developed by [Shenzhi Wang](https://shenzhi-wang.netlify.app) (王慎执) and [Yaowei Zheng](https://github.com/hiyouga) (郑耀威).
|
18 |
-
|
19 |
We have included all instructions on how to download, use, and reproduce our various kinds of models at [this GitHub repo](https://github.com/Shenzhi-Wang/Llama3-Chinese-Chat). If you like our models, we would greatly appreciate it if you could star our GitHub repository. Additionally, please click "like" on our Hugging Face repositories. Thank you!
|
20 |
|
21 |
|
@@ -25,6 +23,19 @@ This model is developed by [Shenzhi Wang](https://shenzhi-wang.netlify.app) (王
|
|
25 |
- [Apr. 29, 2024] We now introduce Llama3-8B-Chinese-Chat-**v2**! Compared to v1, the training dataset of v2 is **5 times larger** (~100K preference pairs), and it exhibits significant enhancements, especially in **roleplay**, **function calling**, and **math** capabilities! The training dataset of Llama3-8B-Chinese-Chat-v2 will be released soon. If you love our Llama3-8B-Chinese-Chat-v1, you won't want to miss out on Llama3-8B-Chinese-Chat-v2!
|
26 |
|
27 |
|
28 |
# 1. Introduction
|
29 |
|
30 |
NOTICE: The main branch contains the 8-bit quantized GGUF version of Llama3-8B-Chinese-Chat-**v2**. To use the 8-bit quantized GGUF version of Llama3-8B-Chinese-Chat-**v1**, please refer to [the `v1` branch](https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat-GGUF-8bit/tree/v1).
|
@@ -56,6 +67,9 @@ Training details:
|
|
56 |
- optimizer: paged_adamw_32bit
|
57 |
|
58 |
|
59 |
To reproduce Llama3-8B-Chinese-Chat-**v2** (to reproduce Llama3-8B-Chinese-Chat-**v1**, please refer to [this link](https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat/blob/v1/README.md#1-introduction)):
|
60 |
|
61 |
```bash
|
@@ -97,6 +111,9 @@ deepspeed --num_gpus 8 src/train_bash.py \
|
|
97 |
--optim paged_adamw_32bit
|
98 |
```
|
99 |
|
100 |
# 2. Usage
|
101 |
|
102 |
```python
|
|
|
14 |
---
|
15 |
|
16 |
|
|
|
|
|
17 |
We have included all instructions on how to download, use, and reproduce our various kinds of models at [this GitHub repo](https://github.com/Shenzhi-Wang/Llama3-Chinese-Chat). If you like our models, we would greatly appreciate it if you could star our GitHub repository. Additionally, please click "like" on our Hugging Face repositories. Thank you!
|
18 |
|
19 |
|
|
|
23 |
- [Apr. 29, 2024] We now introduce Llama3-8B-Chinese-Chat-**v2**! Compared to v1, the training dataset of v2 is **5 times larger** (~100K preference pairs), and it exhibits significant enhancements, especially in **roleplay**, **function calling**, and **math** capabilities! The training dataset of Llama3-8B-Chinese-Chat-v2 will be released soon. If you love our Llama3-8B-Chinese-Chat-v1, you won't want to miss out on Llama3-8B-Chinese-Chat-v2!
|
24 |
|
25 |
|
26 |
+
|
27 |
+
# Model Summary
|
28 |
+
|
29 |
+
Llama3-8B-Chinese-Chat is an instruction-tuned language model for Chinese and English users, built upon the Meta-Llama-3-8B-Instruct model, with abilities such as roleplay and tool use.
|
30 |
+
|
31 |
+
Developed by: [Shenzhi Wang](https://shenzhi-wang.netlify.app) (王慎执) and [Yaowei Zheng](https://github.com/hiyouga) (郑耀威)
|
32 |
+
|
33 |
+
- License: [Llama-3 License](https://llama.meta.com/llama3/license/)
|
34 |
+
- Base Model: Meta-Llama-3-8B-Instruct
|
35 |
+
- Model Size: 8.03B
|
36 |
+
- Context length: 8K
|
37 |
+
|
38 |
+
|
39 |
# 1. Introduction
|
40 |
|
41 |
NOTICE: The main branch contains the 8-bit quantized GGUF version of Llama3-8B-Chinese-Chat-**v2**. To use the 8-bit quantized GGUF version of Llama3-8B-Chinese-Chat-**v1**, please refer to [the `v1` branch](https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat-GGUF-8bit/tree/v1).
|
|
|
67 |
- optimizer: paged_adamw_32bit
|
68 |
|
69 |
|
70 |
+
<details>
|
71 |
+
<summary>To reproduce the model</summary>
|
72 |
+
|
73 |
To reproduce Llama3-8B-Chinese-Chat-**v2** (to reproduce Llama3-8B-Chinese-Chat-**v1**, please refer to [this link](https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat/blob/v1/README.md#1-introduction)):
|
74 |
|
75 |
```bash
|
|
|
111 |
--optim paged_adamw_32bit
|
112 |
```
|
113 |
|
114 |
+
</details>
|
115 |
+
|
116 |
+
|
117 |
# 2. Usage
|
118 |
|
119 |
```python
|