---
license: llama3
datasets:
- yuyijiong/Long-Instruction-with-Paraphrasing
language:
- zh
- en
pipeline_tag: text-generation
---

# Llama3-8b-chinese-chat-32k

## Training method

* The context length is extended to 32k using the NTK-aware method.
* The base model is [shenzhi-wang/Llama3-8B-Chinese-Chat](https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat), fine-tuned with QLoRA for 1 epoch on the [Long-Instruction-with-Paraphrasing](https://huggingface.co/datasets/yuyijiong/Long-Instruction-with-Paraphrasing) dataset.

## Long-context performance

Compared with the original version, this model has stronger long-context ability. It scores higher on most of the LongBench tasks below; the exceptions are multifieldqa_en and passage_retrieval_zh. Bold marks the tasks where the 32k model scores higher.

### LongBench (en)

| model | hotpotqa | multifieldqa_en | passage_retrieval_en | qmsum | trec |
|---|---|---|---|---|---|
| llama3-chinese-8b | 45.88 | 50.56 | 68.0 | 22.52 | 73.0 |
| llama3-8b-chinese-chat-32k | **47.64** | 49.98 | **100.0** | **25.13** | **75.0** |

### LongBench (zh)

| model | dureader | multifieldqa_zh | passage_retrieval_zh | qmsum | trec |
|---|---|---|---|---|---|
| llama3-8b-chinese-chat | 29.08 | 58.4 | 93.5 | 22.52 | 73.0 |
| llama3-8b-chinese-chat-32k | **32.31** | **58.66** | 82.5 | **25.13** | **75.0** |
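
## NTK-aware scaling example

The NTK-aware method named in the training section is usually described as enlarging the RoPE base so that low-frequency rotary dimensions are stretched while high-frequency ones stay almost unchanged. The sketch below illustrates that idea only; it is not this model's training code. The base of 500000, head dimension of 128, and original context of 8192 are assumed from the standard Llama 3 8B configuration, and the 4x scale factor is an assumption derived from 32768 / 8192. This card does not state the exact values or the scaling variant that was used.

```python
def ntk_scaled_base(base: float, scale: float, head_dim: int) -> float:
    """Return the enlarged RoPE base: base * scale ** (d / (d - 2))."""
    return base * scale ** (head_dim / (head_dim - 2))


def rope_inv_freq(base: float, head_dim: int) -> list[float]:
    """Return the rotary inverse frequencies, one per pair of dimensions."""
    return [base ** (-2 * i / head_dim) for i in range(head_dim // 2)]


# Assumed values, not taken from this card.
ORIG_BASE = 500000.0      # assumed Llama 3 8B rope_theta
HEAD_DIM = 128            # assumed: hidden size 4096 / 32 attention heads
SCALE = 32768 / 8192      # assumed: 32k target / 8k original context

new_base = ntk_scaled_base(ORIG_BASE, SCALE, HEAD_DIM)
orig_freq = rope_inv_freq(ORIG_BASE, HEAD_DIM)
new_freq = rope_inv_freq(new_base, HEAD_DIM)

# The highest frequency is untouched; the lowest is divided by SCALE.
print(f"scaled base: {new_base:.0f}")
print(f"highest-frequency ratio: {new_freq[0] / orig_freq[0]:.3f}")
print(f"lowest-frequency ratio: {new_freq[-1] / orig_freq[-1]:.3f}")
```

With this exponent, the lowest rotary frequency ends up exactly `1 / SCALE` of its original value, so the longest wavelength covers the extended context, while the highest frequency keeps its original resolution for nearby tokens.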