shenzhi-wang/Llama3-8B-Chinese-Chat-GGUF-8bit

@@ -14,8 +14,6 @@ tags:
 ---
-This model is developed by [Shenzhi Wang](https://shenzhi-wang.netlify.app) (王慎执) and [Yaowei Zheng](https://github.com/hiyouga) (郑耀威).
 🌟 We included all instructions on how to download, use, and reproduce our various kinds of models at [this GitHub repo](https://github.com/Shenzhi-Wang/Llama3-Chinese-Chat). If you like our models, we would greatly appreciate it if you could star our GitHub repository. Additionally, please click "like" on our HuggingFace repositories. Thank you!
@@ -25,6 +23,19 @@ This model is developed by [Shenzhi Wang](https://shenzhi-wang.netlify.app) (王
 - 🚀🚀🚀 [Apr. 29, 2024] We now introduce Llama3-8B-Chinese-Chat-**v2**! Compared to v1, the training dataset of v2 is **5 times larger** (~100K preference pairs), and it exhibits significant enhancements, especially in **roleplay**, **function calling**, and **math** capabilities! The training dataset of Llama3-8B-Chinese-Chat-v2 will be released soon. If you love our Llama3-8B-Chinese-Chat-v1, you won't want to miss out on Llama3-8B-Chinese-Chat-v2!
 # 1. Introduction
 ❗️❗️❗️NOTICE: The main branch contains the 8-bit quantized GGUF version of Llama3-8B-Chinese-Chat-**v2**. If you want to use the 8-bit quantized GGUF version of Llama3-8B-Chinese-Chat-**v1**, please refer to [the `v1` branch](https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat-GGUF-8bit/tree/v1).
@@ -56,6 +67,9 @@ Training details:
 - optimizer: paged_adamw_32bit
 To reproduce Llama3-8B-Chinese-Chat-**v2** (to reproduce Llama3-8B-Chinese-Chat-**v1**, please refer to [this link](https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat/blob/v1/README.md#1-introduction)):
 ```bash
@@ -97,6 +111,9 @@ deepspeed --num_gpus 8 src/train_bash.py \
     --optim paged_adamw_32bit
 ```
 # 2. Usage
 ```python

 ---
 🌟 We included all instructions on how to download, use, and reproduce our various kinds of models at [this GitHub repo](https://github.com/Shenzhi-Wang/Llama3-Chinese-Chat). If you like our models, we would greatly appreciate it if you could star our GitHub repository. Additionally, please click "like" on our HuggingFace repositories. Thank you!
 - 🚀🚀🚀 [Apr. 29, 2024] We now introduce Llama3-8B-Chinese-Chat-**v2**! Compared to v1, the training dataset of v2 is **5 times larger** (~100K preference pairs), and it exhibits significant enhancements, especially in **roleplay**, **function calling**, and **math** capabilities! The training dataset of Llama3-8B-Chinese-Chat-v2 will be released soon. If you love our Llama3-8B-Chinese-Chat-v1, you won't want to miss out on Llama3-8B-Chinese-Chat-v2!
+# Model Summary
+Llama3-8B-Chinese-Chat is an instruction-tuned language model for Chinese and English users, built upon the Meta-Llama-3-8B-Instruct model, with abilities such as roleplay and tool use.
+- Developed by: [Shenzhi Wang](https://shenzhi-wang.netlify.app) (王慎执) and [Yaowei Zheng](https://github.com/hiyouga) (郑耀威)
+- License: [Llama-3 License](https://llama.meta.com/llama3/license/)
+- Base Model: Meta-Llama-3-8B-Instruct
+- Model Size: 8.03B
+- Context length: 8K
 # 1. Introduction
 ❗️❗️❗️NOTICE: The main branch contains the 8-bit quantized GGUF version of Llama3-8B-Chinese-Chat-**v2**. If you want to use the 8-bit quantized GGUF version of Llama3-8B-Chinese-Chat-**v1**, please refer to [the `v1` branch](https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat-GGUF-8bit/tree/v1).
 - optimizer: paged_adamw_32bit
+<details>
+<summary>To reproduce the model</summary>
 To reproduce Llama3-8B-Chinese-Chat-**v2** (to reproduce Llama3-8B-Chinese-Chat-**v1**, please refer to [this link](https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat/blob/v1/README.md#1-introduction)):
 ```bash
     --optim paged_adamw_32bit
 ```
+</details>
 # 2. Usage
 ```python