Jackrong committed (verified) · Commit 4e7765a · Parent: 86626bd

Update README.md

Files changed (1):
  1. README.md +53 -61

README.md CHANGED
@@ -20,8 +20,8 @@ license: llama3

# Llama-3.1-8B-Instruct-Elite

- **Abstract**
- A bilingual (Chinese/English) instruction-tuned model based on **Llama-3.1-8B-Instruct**. It follows the training method of *Llama-3.2-3B-Elite* (distillation from the **Qwen-3-235b-a22b-Instruct-2507** teacher model + SFT), but **intentionally reduces emojis** while **retaining and reinforcing professional output formatting** (e.g., bolded subheadings, bullet lists, clear paragraphs) for **cleaner, more stable, easier-to-read** answers.

<!-- Badges Layout for LLaMA fine-tuned model -->
<div align="center">
@@ -33,7 +33,6 @@ license: llama3
<img alt="GPU" src="https://img.shields.io/badge/GPU-A100_single-3f51b5?style=for-the-badge">

<!-- Bottom row: extra info -->
-
<img alt="Quantization" src="https://img.shields.io/badge/GGUF-Q4_K_M-00acc1?style=for-the-badge">
<img alt="License" src="https://img.shields.io/badge/License-Llama_3.1_Community-ff7043?style=for-the-badge">
@@ -41,53 +40,54 @@ license: llama3

---

- ## Table of Contents
- - [Highlights](#highlights)
- - [Model Overview](#model-overview)
- - [Training & Data](#training--data)
- - [Quickstart](#quickstart)
- - [Use Cases & Limitations](#use-cases--limitations)
- - [Deployment & Quantization](#deployment--quantization)
- - [License](#license)
- - [Acknowledgments](#acknowledgments)
- - [Citation](#citation)
- - [Changelog](#changelog)

---

- ## Highlights
- - **Professional and clean**: fewer emojis by default; outputs lead with **bolded subheadings + bullet lists**, easy to copy and edit further.
- - **Stable structure**: style and format alignment for **sectioned reports, step checklists, comparison tables, and key-point summaries**.
- - **Bilingual / mixed-text optimized**: good terminology consistency and clear hierarchy in Chinese, English, and mixed Chinese–English scenarios.
- - **Stronger instruction-following**: **higher adherence** to constraints such as "no emojis", "output only a key-point table", and "preserve Markdown heading levels".
- - **Controllable verbosity**: **less verbose by default**, focusing on key information while keeping necessary context.

- > **Base**: `meta-llama/Llama-3.1-8B-Instruct`; **Training paradigm**: teacher distillation + SFT

---

- ## Model Overview
- - **Parameters**: 8B
- - **Tasks**: instruction following / dialogue generation / Q&A / summarization / structured output
- - **Languages**: Chinese & English (good support for mixed Chinese–English)
- - **Goal**: produce **concise, professional, format-friendly** content on light compute (fewer emojis; keep bolded subheadings, bullet lists, and other formatting optimizations)

---

- ## Training & Data
- - **Data size**: about **80,000** high-quality instruction–response samples (Chinese/English mix covering Q&A, summarization, expository writing, structured output, step-by-step explanations, etc.).
- - **Method**: teacher **distillation** + **SFT**; explicit control of **format/style** (fewer emojis; emphasis on headings/lists/bold).
- - **Compute**: **single A100**; LoRA/QLoRA can complete several epochs in a short time.
- - **Style & constraints**: fewer emojis; strengthened **bold subheadings**, **bullet lists**, **bold key terms**, and **paragraph hierarchy**.

- > If a distilled-data subset is released, add links and statistics here (sample counts / language ratios / filtering criteria).

---

- ## Quickstart

<details>
- <summary><b>Transformers (recommended)</b></summary>

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
...
```
@@ -134,50 +134,42 @@ print(outputs[0].outputs[0].text)
</details>

<details>
- <summary><b>llama.cpp (GGUF: Q4_K_M)</b></summary>

```bash
- ./main -m Llama-3.1-8B-Instruct-Elite.Q4_K_M.gguf \
-   -p "以要点说明:如何将技术文章改写得更专业且干净?"
```
</details>

---
- ## Prompting & Output Conventions
- - Use **concise headings** and **bolded subheadings** to organize structure; **bold** key terms and conclusions in moderation.
- - Present steps and key points as **bullet lists**; **avoid emojis by default**.
- - **Sampling tips**: `temperature=0.6–0.8`, `top_p=0.9–0.95`.
-
- ---
-
- ## Use Cases & Limitations
- **Use cases**: Chinese/English or mixed Chinese–English **Q&A, summarization, expository, and technical/business writing**; **structured output** (plans, steps, tables, FAQs, meeting minutes).
- **Limitations**: tasks with strong factuality demands or a need for **up-to-date information** should be paired with retrieval; **high-risk** outputs (medical/legal/investment) require human review; do not use for illegal or harmful purposes.

---

- ## Deployment & Quantization
- - **Transformers**: `torch.bfloat16/float16` recommended; for low VRAM/CPU, consider 4/5/6/8-bit (`bitsandbytes`, AQLM, AutoGPTQ, vLLM, etc.).
- - **GGUF**: **Q4_K_M** is currently provided; if more quantization levels are released, also document the **context length**, **RoPE settings**, and **SHA256** in the README.
- - **Verification**: `shasum -a 256 <filename>`

---

- ## License
- - **Model weights**: follow the **Llama 3.1 Community License** (same as the base model).
- - **Code/scripts**: may use **Apache-2.0** or similar; this does not change the weight license.

---

- ## Acknowledgments
- - Meta for **Llama-3.1** and its ecosystem toolchain
- - Open-source community contributions to **distillation, SFT, evaluation, and deployment**
- - Training method and practices carried over from *Llama-3.2-3B-Elite*

---

- ## Citation
```bibtex
@misc{JackrongL31_8B_Elite,
  title = {Jackrong/Llama-3.1-8B-Instruct-Elite},
  ...
```
@@ -189,5 +181,5 @@ print(outputs[0].outputs[0].text)

---

- ## Changelog
- - v1.0: initial release. Data size ~**80k**; trained on a **single A100**; **GGUF Q4_K_M** provided; fewer emojis; strengthened bold subheadings and bullet lists; training recipe identical to 3.2-3B-Elite.
 
# Llama-3.1-8B-Instruct-Elite

+ **Abstract**
+ A bilingual (Chinese/English) instruction-tuned model based on **Llama-3.1-8B-Instruct**. It follows the training recipe of *Llama-3.2-3B-Elite* (Qwen-3-235b-a22b-Instruct-2507 as teacher for distillation + SFT), but intentionally reduces emojis (a tendency inherited from the Qwen3 teacher) while retaining and reinforcing professional formatting (e.g., bolded subheadings, bullet lists, clear paragraphs) to produce answers that are cleaner, more stable, and easier to read.

<!-- Badges Layout for LLaMA fine-tuned model -->
<div align="center">

<img alt="GPU" src="https://img.shields.io/badge/GPU-A100_single-3f51b5?style=for-the-badge">

<!-- Bottom row: extra info -->

<img alt="Quantization" src="https://img.shields.io/badge/GGUF-Q4_K_M-00acc1?style=for-the-badge">
<img alt="License" src="https://img.shields.io/badge/License-Llama_3.1_Community-ff7043?style=for-the-badge">

---

+ ## Table of Contents
+ - [Highlights](#highlights)
+ - [Model Overview](#model-overview)
+ - [Training & Data](#training--data)
+ - [Quickstart](#quickstart)
+ - [Prompting & Output Conventions](#prompting--output-conventions)
+ - [Use Cases & Limitations](#use-cases--limitations)
+ - [Deployment & Quantization](#deployment--quantization)
+ - [License](#license)
+ - [Acknowledgments](#acknowledgments)
+ - [Citation](#citation)
+ - [Changelog](#changelog)

---

+ ## Highlights
+ - **Professional and clean**: fewer emojis by default; outputs emphasize bolded subheadings + bullet lists, making content easy to copy and edit further.
+ - **Stable structure**: consistent formatting for sectioned reports, step checklists, comparison tables, and key-point summaries.
+ - **Bilingual / mixed-text friendly**: strong terminology coherence and clear hierarchy for Chinese, English, and mixed Chinese–English scenarios.
+ - **Stronger instruction-following**: higher adherence to constraints such as "no emojis", "only output key-point tables", and "preserve Markdown heading levels".
+ - **Controllable verbosity**: defaults to less verbosity, focusing on key information while keeping necessary context.

+ > **Base**: `meta-llama/Llama-3.1-8B-Instruct`; **Training paradigm**: teacher distillation + SFT.
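The "preserve Markdown heading levels" constraint from the Highlights is easy to spot-check downstream. The helper below is an illustrative sketch (not part of this model card's tooling) that compares heading hierarchies between a source document and a rewrite:

```python
import re

def heading_levels(markdown: str) -> list[int]:
    """Return the level (1-6) of each ATX heading, in document order."""
    return [len(m.group(1)) for m in re.finditer(r"^(#{1,6})\s", markdown, re.MULTILINE)]

def levels_preserved(source: str, rewrite: str) -> bool:
    """True if the rewrite keeps exactly the same heading hierarchy as the source."""
    return heading_levels(source) == heading_levels(rewrite)
```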
---

+ ## Model Overview
+ - **Parameters**: 8B
+ - **Tasks**: instruction following / dialogue generation / Q&A / summarization / structured output
+ - **Languages**: Chinese & English (robust for mixed Chinese–English)
+ - **Goal**: deliver concise, professional, and format-friendly content on modest compute (reduced emojis; keep bolded subheadings, bullet lists, and other formatting enhancements).

---

+ ## Training & Data
+ - **Data size**: about **80,000** high-quality instruction–response pairs (Chinese/English mix covering Q&A, summarization, expository writing, structured output, procedural steps, etc.).
+ - **Method**: **distillation** from a teacher model + **SFT**; explicit **format/style control** (fewer emojis; emphasis on headings/lists/bold).
+ - **Compute**: **single A100**; LoRA/QLoRA can complete several epochs within a short time.
+ - **Style & constraints**: fewer emojis; strengthened bold subheadings, bullet lists, bold key terms, and clear paragraph hierarchy.

+ > If a distilled-data subset is released, add links and stats here (sample counts / language ratios / filtering rules).

---

+ ## Quickstart

<details>
+ <summary><b>Transformers (recommended)</b></summary>

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
...
```
</details>

<details>
+ <summary><b>llama.cpp (GGUF: Q4_K_M)</b></summary>

```bash
+ ./main -m Llama-3.1-8B-Instruct-Elite.Q4_K_M.gguf -p "以要点说明:如何将技术文章改写得更专业且干净?"
```
</details>

---

+ ## Prompting & Output Conventions
+ - Organize with concise headings and bolded subheadings; bold key terms and conclusions where helpful.
+ - Use bullet lists for steps and key points; avoid emojis by default.
+ - **Sampling tips**: `temperature=0.6–0.8`, `top_p=0.9–0.95`.
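The "avoid emojis by default" convention above can also be enforced after generation with a light post-filter. This is an illustrative sketch, not part of the model card's tooling; the regex covers the major emoji blocks, not every pictograph:

```python
import re

# Major Unicode emoji blocks; a pragmatic filter, not an exhaustive detector.
EMOJI_PATTERN = re.compile(
    "["
    "\U0001F300-\U0001F5FF"  # symbols & pictographs
    "\U0001F600-\U0001F64F"  # emoticons
    "\U0001F680-\U0001F6FF"  # transport & map symbols
    "\U0001F900-\U0001FAFF"  # supplemental symbols & pictographs
    "\u2600-\u27BF"          # misc symbols & dingbats
    "]+"
)

def strip_emojis(text: str) -> str:
    """Remove characters in the common emoji ranges from model output."""
    return EMOJI_PATTERN.sub("", text)
```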
---

+ ## Use Cases & Limitations
+ **Use cases**: Chinese/English or mixed bilingual **Q&A, summarization, instructional/technical/business writing**; **structured outputs** (plans, steps, tables, FAQs, meeting minutes).
+ **Limitations**: for high-factuality tasks that require **up-to-date information**, pair with retrieval; for medical/legal/financial or other **high-risk** scenarios, use human review; do not use for illegal or harmful purposes.

---

+ ## License
+ - **Model weights**: **Llama 3.1 Community License** (same as the base model).
+ - **Code/scripts**: may use **Apache-2.0** or similar; the weight license remains unchanged.

---

+ ## Acknowledgments
+ - Meta for **Llama-3.1** and the broader ecosystem
+ - Open-source community contributions to **distillation, SFT, evaluation, and deployment**
+ - Training recipe and practices adapted from *Llama-3.2-3B-Elite*

---

+ ## Citation
```bibtex
@misc{JackrongL31_8B_Elite,
  title = {Jackrong/Llama-3.1-8B-Instruct-Elite},
  ...
```

---

+ ## Changelog
+ - v1.0: initial release. ~**80k** samples; trained on a **single A100**; provides **GGUF Q4_K_M**; fewer emojis; strengthened bold subheadings and bullet lists; training recipe aligned with 3.2-3B-Elite.