Jackrong committed (verified) · Commit 4e7765a · Parent: 86626bd

Update README.md

Files changed (1):
  1. README.md +53 -61

README.md CHANGED
@@ -20,8 +20,8 @@ license: llama3

# Llama-3.1-8B-Instruct-Elite

- **Abstract**
- A bilingual (Chinese/English) instruction-tuned model based on **Llama-3.1-8B-Instruct**. It follows the training method of *Llama-3.2-3B-Elite* (distillation from the **Qwen-3-235b-a22b-Instruct-2507** teacher model + SFT), but **intentionally reduces emojis** while **retaining and reinforcing professional output formatting** (e.g., bolded subheadings, bullet lists, clear paragraphs) for **cleaner, more stable, easier-to-read** answers.

<!-- Badges Layout for LLaMA fine-tuned model -->
<div align="center">
@@ -33,7 +33,6 @@ license: llama3
<img alt="GPU" src="https://img.shields.io/badge/GPU-A100_single-3f51b5?style=for-the-badge">

<!-- Bottom row: extra info -->
-
<img alt="Quantization" src="https://img.shields.io/badge/GGUF-Q4_K_M-00acc1?style=for-the-badge">
<img alt="License" src="https://img.shields.io/badge/License-Llama_3.1_Community-ff7043?style=for-the-badge">
@@ -41,53 +40,54 @@ license: llama3

---

- ## Table of Contents
- - [Highlights](#highlights)
- - [Model Overview](#model-overview)
- - [Training & Data](#training--data)
- - [Quickstart](#quickstart)
- - [Use Cases & Limitations](#use-cases--limitations)
- - [Deployment & Quantization](#deployment--quantization)
- - [License](#license)
- - [Acknowledgments](#acknowledgments)
- - [Citation](#citation)
- - [Changelog](#changelog)

---

- ## Highlights
- - **Professional and clean**: fewer emojis by default; outputs lead with **bolded subheadings + bullet lists**, easy to copy and edit further.
- - **Stable structure**: style and format alignment for **sectioned reports, step checklists, comparison tables, and key-point summaries**.
- - **Bilingual / mixed-text optimized**: good terminology consistency and clear hierarchy in Chinese, English, and mixed Chinese–English scenarios.
- - **Stronger instruction-following**: **higher adherence** to constraints such as "no emojis", "output only a key-point table", and "preserve Markdown heading levels".
- - **Controllable verbosity**: **less verbose by default**, focusing on key information while keeping necessary context.

- > **Base**: `meta-llama/Llama-3.1-8B-Instruct`; **Training paradigm**: teacher distillation + SFT

---

- ## Model Overview
- - **Parameters**: 8B
- - **Tasks**: instruction following / dialogue generation / Q&A / summarization / structured output
- - **Languages**: Chinese & English (good support for mixed Chinese–English)
- - **Goal**: produce **concise, professional, format-friendly** content on light compute (fewer emojis; keep bolded subheadings, bullet lists, and other formatting optimizations)

---

- ## Training & Data
- - **Data size**: about **80,000** high-quality instruction–response samples (Chinese/English mix covering Q&A, summarization, expository writing, structured output, step-by-step explanations, etc.).
- - **Method**: teacher **distillation** + **SFT**; explicit control of **format/style** (fewer emojis; emphasis on headings/lists/bold).
- - **Compute**: **single A100**; LoRA/QLoRA can complete several epochs in a short time.
- - **Style & constraints**: fewer emojis; strengthened **bold subheadings**, **bullet lists**, **bold key terms**, and **paragraph hierarchy**.

- > If a distilled-data subset is released, add links and statistics here (sample counts / language ratios / filtering criteria).

---

- ## Quickstart

<details>
- <summary><b>Transformers (recommended)</b></summary>

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
...
```
@@ -134,50 +134,42 @@ print(outputs[0].outputs[0].text)
</details>

<details>
- <summary><b>llama.cpp (GGUF: Q4_K_M)</b></summary>

```bash
- ./main -m Llama-3.1-8B-Instruct-Elite.Q4_K_M.gguf \
-   -p "以要点说明:如何将技术文章改写得更专业且干净?"
```
</details>

---
- ## Prompting & Output Conventions
- - Use **concise headings** and **bolded subheadings** to organize structure; **bold** key terms and conclusions in moderation.
- - Present steps and key points as **bullet lists**; **avoid emojis by default**.
- - **Sampling tips**: `temperature=0.6–0.8`, `top_p=0.9–0.95`.
-
- ---
-
- ## Use Cases & Limitations
- **Use cases**: Chinese/English or mixed Chinese–English **Q&A, summarization, expository, and technical/business writing**; **structured output** (plans, steps, tables, FAQs, meeting minutes).
- **Limitations**: tasks with strong factuality demands or a need for **up-to-date information** should be paired with retrieval; **high-risk** outputs (medical/legal/investment) require human review; do not use for illegal or harmful purposes.

---

- ## Deployment & Quantization
- - **Transformers**: `torch.bfloat16/float16` recommended; for low VRAM/CPU, consider 4/5/6/8-bit (`bitsandbytes`, AQLM, AutoGPTQ, vLLM, etc.).
- - **GGUF**: **Q4_K_M** is currently provided; if more quantization levels are released, also document the **context length**, **RoPE settings**, and **SHA256** in the README.
- - **Verification**: `shasum -a 256 <filename>`

---

- ## License
- - **Model weights**: follow the **Llama 3.1 Community License** (same as the base model).
- - **Code/scripts**: may use **Apache-2.0** or similar; this does not change the weight license.

---

- ## Acknowledgments
- - Meta for **Llama-3.1** and its ecosystem toolchain
- - Open-source community contributions to **distillation, SFT, evaluation, and deployment**
- - Training method and practices carried over from *Llama-3.2-3B-Elite*

---

- ## Citation
```bibtex
@misc{JackrongL31_8B_Elite,
  title = {Jackrong/Llama-3.1-8B-Instruct-Elite},
  ...
```
@@ -189,5 +181,5 @@ print(outputs[0].outputs[0].text)

---

- ## Changelog
- - v1.0: initial release. Data size ~**80k**; trained on a **single A100**; **GGUF Q4_K_M** provided; fewer emojis; strengthened bold subheadings and bullet lists; training recipe identical to 3.2-3B-Elite.
 
# Llama-3.1-8B-Instruct-Elite

+ **Abstract**
+ A bilingual (Chinese/English) instruction-tuned model based on **Llama-3.1-8B-Instruct**. It follows the training recipe of *Llama-3.2-3B-Elite* (Qwen-3-235b-a22b-Instruct-2507 as teacher for distillation + SFT), but intentionally reduces emojis (a tendency inherited from the Qwen3 teacher) while retaining and reinforcing professional formatting (e.g., bolded subheadings, bullet lists, clear paragraphs) to produce answers that are cleaner, more stable, and easier to read.

<!-- Badges Layout for LLaMA fine-tuned model -->
<div align="center">

<img alt="GPU" src="https://img.shields.io/badge/GPU-A100_single-3f51b5?style=for-the-badge">

<!-- Bottom row: extra info -->

<img alt="Quantization" src="https://img.shields.io/badge/GGUF-Q4_K_M-00acc1?style=for-the-badge">
<img alt="License" src="https://img.shields.io/badge/License-Llama_3.1_Community-ff7043?style=for-the-badge">

---

+ ## Table of Contents
+ - [Highlights](#highlights)
+ - [Model Overview](#model-overview)
+ - [Training & Data](#training--data)
+ - [Quickstart](#quickstart)
+ - [Prompting & Output Conventions](#prompting--output-conventions)
+ - [Use Cases & Limitations](#use-cases--limitations)
+ - [Deployment & Quantization](#deployment--quantization)
+ - [License](#license)
+ - [Acknowledgments](#acknowledgments)
+ - [Citation](#citation)
+ - [Changelog](#changelog)

---

+ ## Highlights
+ - **Professional and clean**: fewer emojis by default; outputs emphasize bolded subheadings + bullet lists, making content easy to copy and edit further.
+ - **Stable structure**: consistent formatting for sectioned reports, step checklists, comparison tables, and key-point summaries.
+ - **Bilingual / mixed-text friendly**: strong terminology coherence and clear hierarchy for Chinese, English, and mixed Chinese–English scenarios.
+ - **Stronger instruction-following**: higher adherence to constraints such as "no emojis", "only output key-point tables", and "preserve Markdown heading levels".
+ - **Controllable verbosity**: defaults to less verbosity, focusing on key information while keeping necessary context.

+ > **Base**: `meta-llama/Llama-3.1-8B-Instruct`; **Training paradigm**: teacher distillation + SFT.
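The "preserve Markdown heading levels" constraint from the Highlights is easy to spot-check downstream. The helper below is an illustrative sketch (not part of this model card's tooling) that compares heading hierarchies between a source document and a rewrite:

```python
import re

def heading_levels(markdown: str) -> list[int]:
    """Return the level (1-6) of each ATX heading, in document order."""
    return [len(m.group(1)) for m in re.finditer(r"^(#{1,6})\s", markdown, re.MULTILINE)]

def levels_preserved(source: str, rewrite: str) -> bool:
    """True if the rewrite keeps exactly the same heading hierarchy as the source."""
    return heading_levels(source) == heading_levels(rewrite)
```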
---

+ ## Model Overview
+ - **Parameters**: 8B
+ - **Tasks**: instruction following / dialogue generation / Q&A / summarization / structured output
+ - **Languages**: Chinese & English (robust for mixed Chinese–English)
+ - **Goal**: deliver concise, professional, and format-friendly content on modest compute (reduced emojis; keep bolded subheadings, bullet lists, and other formatting enhancements).

---

+ ## Training & Data
+ - **Data size**: about **80,000** high-quality instruction–response pairs (Chinese/English mix covering Q&A, summarization, expository writing, structured output, procedural steps, etc.).
+ - **Method**: **distillation** from a teacher model + **SFT**; explicit **format/style control** (fewer emojis; emphasis on headings/lists/bold).
+ - **Compute**: **single A100**; LoRA/QLoRA can complete several epochs within a short time.
+ - **Style & constraints**: fewer emojis; strengthened bold subheadings, bullet lists, bold key terms, and clear paragraph hierarchy.

+ > If a distilled-data subset is released, add links and stats here (sample counts / language ratios / filtering rules).

---

+ ## Quickstart

<details>
+ <summary><b>Transformers (recommended)</b></summary>

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
...
```
</details>

<details>
+ <summary><b>llama.cpp (GGUF: Q4_K_M)</b></summary>

```bash
+ ./main -m Llama-3.1-8B-Instruct-Elite.Q4_K_M.gguf -p "以要点说明:如何将技术文章改写得更专业且干净?"
```
</details>

---

+ ## Prompting & Output Conventions
+ - Organize with concise headings and bolded subheadings; bold key terms and conclusions where helpful.
+ - Use bullet lists for steps and key points; avoid emojis by default.
+ - **Sampling tips**: `temperature=0.6–0.8`, `top_p=0.9–0.95`.
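The "avoid emojis by default" convention above can also be enforced after generation with a light post-filter. This is an illustrative sketch, not part of the model card's tooling; the regex covers the major emoji blocks, not every pictograph:

```python
import re

# Major Unicode emoji blocks; a pragmatic filter, not an exhaustive detector.
EMOJI_PATTERN = re.compile(
    "["
    "\U0001F300-\U0001F5FF"  # symbols & pictographs
    "\U0001F600-\U0001F64F"  # emoticons
    "\U0001F680-\U0001F6FF"  # transport & map symbols
    "\U0001F900-\U0001FAFF"  # supplemental symbols & pictographs
    "\u2600-\u27BF"          # misc symbols & dingbats
    "]+"
)

def strip_emojis(text: str) -> str:
    """Remove characters in the common emoji ranges from model output."""
    return EMOJI_PATTERN.sub("", text)
```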
---

+ ## Use Cases & Limitations
+ **Use cases**: Chinese/English or mixed bilingual **Q&A, summarization, instructional/technical/business writing**; **structured outputs** (plans, steps, tables, FAQs, meeting minutes).
+ **Limitations**: for high-factuality tasks that require **up-to-date information**, pair with retrieval; for medical/legal/financial or other **high-risk** scenarios, use human review; do not use for illegal or harmful purposes.

---

+ ## License
+ - **Model weights**: **Llama 3.1 Community License** (same as the base model).
+ - **Code/scripts**: may use **Apache-2.0** or similar; the weight license remains unchanged.

---

+ ## Acknowledgments
+ - Meta for **Llama-3.1** and the broader ecosystem
+ - Open-source community contributions to **distillation, SFT, evaluation, and deployment**
+ - Training recipe and practices adapted from *Llama-3.2-3B-Elite*

---

+ ## Citation
```bibtex
@misc{JackrongL31_8B_Elite,
  title = {Jackrong/Llama-3.1-8B-Instruct-Elite},
  ...
```

---

+ ## Changelog
+ - v1.0: initial release. ~**80k** samples; trained on a **single A100**; provides **GGUF Q4_K_M**; fewer emojis; strengthened bold subheadings and bullet lists; training recipe aligned with 3.2-3B-Elite.