nielsr HF Staff committed on
Commit cc94bbc · verified · 1 Parent(s): f24b538

Add link to paper and mention it in the description

This PR improves the model card by adding a link to the paper and mentioning the paper in the description.

Files changed (1):
1. README.md (+538 -196)

README.md CHANGED
---
language:
- zh
- en
library_name: transformers
license: apache-2.0
pipeline_tag: text-generation
---

<div align="center">
<img src="https://github.com/OpenBMB/MiniCPM/blob/main/assets/minicpm_logo.png?raw=true" width="500em" />
</div>

<p align="center">
<a href="https://github.com/OpenBMB/MiniCPM/" target="_blank">GitHub Repo</a> |
<a href="https://github.com/OpenBMB/MiniCPM/tree/main/report/MiniCPM_4_Technical_Report.pdf" target="_blank">Technical Report</a> |
<a href="https://huggingface.co/papers/2506.07900" target="_blank">Paper</a>
</p>
<p align="center">
👋 Join us on <a href="https://discord.gg/3cGQn9b3YM" target="_blank">Discord</a> and <a href="https://github.com/OpenBMB/MiniCPM/blob/main/assets/wechat.jpg" target="_blank">WeChat</a>
</p>

This repository contains the model described in the paper [MiniCPM4: Ultra-Efficient LLMs on End Devices](https://huggingface.co/papers/2506.07900).

## What's New

* [2025-06-05] 🚀🚀🚀 We have open-sourced **MiniCPM4-Survey**, a model built upon MiniCPM4-8B that is capable of generating trustworthy, long-form survey papers while maintaining competitive performance relative to significantly larger models.

## MiniCPM4 Series

The MiniCPM4 series comprises highly efficient large language models (LLMs) designed explicitly for end-side devices. This efficiency comes from systematic innovation in four key dimensions: model architecture, training data, training algorithms, and inference systems.
- [MiniCPM4-8B](https://huggingface.co/openbmb/MiniCPM4-8B): The flagship of MiniCPM4, with 8B parameters, trained on 8T tokens.
- [MiniCPM4-0.5B](https://huggingface.co/openbmb/MiniCPM4-0.5B): The small version of MiniCPM4, with 0.5B parameters, trained on 1T tokens.
- [MiniCPM4-8B-Eagle-FRSpec](https://huggingface.co/openbmb/MiniCPM4-8B-Eagle-FRSpec): Eagle head for FRSpec, accelerating speculative inference for MiniCPM4-8B.
- [MiniCPM4-8B-Eagle-FRSpec-QAT-cpmcu](https://huggingface.co/openbmb/MiniCPM4-8B-Eagle-FRSpec-QAT-cpmcu): Eagle head trained with QAT for FRSpec, efficiently integrating speculation and quantization to achieve extreme acceleration of MiniCPM4-8B.
- [MiniCPM4-8B-Eagle-vLLM](https://huggingface.co/openbmb/MiniCPM4-8B-Eagle-vLLM): Eagle head in vLLM format, accelerating speculative inference for MiniCPM4-8B.
- [MiniCPM4-8B-marlin-Eagle-vLLM](https://huggingface.co/openbmb/MiniCPM4-8B-marlin-Eagle-vLLM): Quantized Eagle head in vLLM format, accelerating speculative inference for MiniCPM4-8B.
- [BitCPM4-0.5B](https://huggingface.co/openbmb/BitCPM4-0.5B): Extreme ternary quantization applied to MiniCPM4-0.5B compresses model parameters into ternary values, achieving a 90% reduction in bit width.
- [BitCPM4-1B](https://huggingface.co/openbmb/BitCPM4-1B): Extreme ternary quantization applied to MiniCPM3-1B compresses model parameters into ternary values, achieving a 90% reduction in bit width.
- [MiniCPM4-Survey](https://huggingface.co/openbmb/MiniCPM4-Survey): Based on MiniCPM4-8B, accepts users' queries as input and autonomously generates trustworthy, long-form survey papers. (**<-- you are here**)
- [MiniCPM4-MCP](https://huggingface.co/openbmb/MiniCPM4-MCP): Based on MiniCPM4-8B, accepts users' queries and available MCP tools as input and autonomously calls relevant MCP tools to satisfy users' requirements.

## Overview

**MiniCPM4-Survey** is an open-source LLM agent model jointly developed by [THUNLP](https://nlp.csai.tsinghua.edu.cn), Renmin University of China, and [ModelBest](https://modelbest.cn/en). Built on [MiniCPM4](https://github.com/OpenBMB/MiniCPM4) with 8 billion parameters, it accepts users' queries as input and autonomously generates trustworthy, long-form survey papers.

Key features include:

- **Plan-Retrieve-Write Survey Generation Framework** — We propose a multi-agent generation framework that operates through three core stages: planning (defining the overall structure of the survey), retrieval (generating appropriate retrieval keywords), and writing (synthesizing the retrieved information to generate coherent section-level content). See the sketch after this list.

- **High-Quality Dataset Construction** — We gather and process a large collection of expert-written survey papers to construct a high-quality training dataset. Meanwhile, we collect a large number of research papers to build a retrieval database.

- **Multi-Aspect Reward Design** — We carefully design a reward system with three aspects (structure, content, and citations) to evaluate the quality of the surveys, which is used as the reward function in the RL training stage.

- **Multi-Step RL Training Strategy** — We propose a *Context Manager* to ensure retention of essential information while facilitating efficient reasoning, and we construct a *Parallel Environment* to maintain efficient RL training cycles.
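
For intuition, the three stages compose as in the following minimal sketch (illustrative only: the function names here are hypothetical and are not the repository's API).

```python
from typing import Callable, List

# Schematic of the Plan-Retrieve-Write loop. Each stage is passed in as a
# callable because the real agents live in the MiniCPM4-Survey codebase.
def generate_survey(
    query: str,
    plan: Callable[[str], List[str]],              # stage 1: outline the survey
    propose_keywords: Callable[[str], List[str]],  # stage 2a: retrieval keywords
    retrieve: Callable[[List[str]], List[str]],    # stage 2b: fetch candidate papers
    write: Callable[[str, List[str]], str],        # stage 3: section-level writing
) -> str:
    sections = []
    for heading in plan(query):
        papers = retrieve(propose_keywords(heading))
        sections.append(write(heading, papers))
    return "\n\n".join(sections)
```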

## Quick Start

### Download the model

Download [MiniCPM4-Survey](https://huggingface.co/openbmb/MiniCPM4-Survey) from Hugging Face and place it in `model/MiniCPM4-Survey`.
We recommend using [MiniCPM-Embedding-Light](https://huggingface.co/openbmb/MiniCPM-Embedding-Light) as the embedding model, which can be downloaded from Hugging Face and placed in `model/MiniCPM-Embedding-Light`.
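
Both checkpoints can also be fetched programmatically with `huggingface_hub` (a minimal sketch; adjust `local_dir` to your own layout):

```python
from huggingface_hub import snapshot_download

# Place both models where the scripts below expect to find them.
snapshot_download(repo_id="openbmb/MiniCPM4-Survey", local_dir="model/MiniCPM4-Survey")
snapshot_download(repo_id="openbmb/MiniCPM-Embedding-Light", local_dir="model/MiniCPM-Embedding-Light")
```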

### Prepare the environment

You can download the [paper data](https://www.kaggle.com/datasets/Cornell-University/arxiv) from Kaggle and extract it. Then run `python data_process.py` to process the data, and `python build_index.py` to build the retrieval index:

```bash
cd ./code
curl -L -o ~/Downloads/arxiv.zip \
  https://www.kaggle.com/api/v1/datasets/download/Cornell-University/arxiv
unzip ~/Downloads/arxiv.zip -d .
mkdir data
python ./src/preprocess/data_process.py
mkdir index
python ./src/preprocess/build_index.py
```
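
Conceptually, `build_index.py` embeds the processed paper records and stores them in a vector index. A minimal sketch of that idea (illustrative only: it assumes the embedding model loads through `sentence-transformers`, the paths are hypothetical, and the repository's `build_index.py` remains the authoritative implementation):

```python
import faiss
from sentence_transformers import SentenceTransformer

# Embed paper abstracts with the recommended embedding model.
encoder = SentenceTransformer("model/MiniCPM-Embedding-Light", trust_remote_code=True)
abstracts = ["abstract of paper 1 ...", "abstract of paper 2 ..."]  # from the processed arXiv dump

# With normalized embeddings, inner product equals cosine similarity.
embeddings = encoder.encode(abstracts, normalize_embeddings=True)

index = faiss.IndexFlatIP(embeddings.shape[1])
index.add(embeddings)
faiss.write_index(index, "index/papers.faiss")  # hypothetical output path
```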

### Model Inference

You can run the following commands to set up the retrieval environment and start inference:

```bash
cd ./code
python ./src/retriever.py
bash ./scripts/run.sh
```

If you want to run with the frontend, you can run the following commands:

```bash
cd ./code
python ./src/retriever.py
bash ./scripts/run_with_frontend.sh
cd frontend/minicpm4-survey
npm install
npm run dev
```

Then you can visit `http://localhost:5173` in your browser to use the model.

## Performance Evaluation

| Method | Relevance | Coverage | Depth | Novelty | Avg. | FactScore |
|---------------------------------------------|-----------|----------|-------|---------|-------|------------|
| Naive RAG (driven by G2FT) | 3.25 | 2.95 | 3.35 | 2.60 | 3.04 | 43.68 |
| AutoSurvey (driven by G2FT) | 3.10 | 3.25 | 3.15 | **3.15** | 3.16 | 46.56 |
| Webthinker (driven by WTR1-7B) | 3.30 | 3.00 | 2.75 | 2.50 | 2.89 | -- |
| Webthinker (driven by QwQ-32B) | 3.40 | 3.30 | 3.30 | 2.50 | 3.13 | -- |
| OpenAI Deep Research (driven by GPT-4o) | 3.50 | **3.95** | 3.55 | 3.00 | **3.50** | -- |
| MiniCPM4-Survey | 3.45 | 3.70 | **3.85** | 3.00 | **3.50** | **68.73** |
| &nbsp;&nbsp;&nbsp;*w/o* RL | **3.55** | 3.35 | 3.30 | 2.25 | 3.11 | 50.24 |

*Performance comparison of the survey generation systems, scored by GPT-4o. "G2FT" stands for Gemini-2.0-Flash-Thinking, and "WTR1-7B" denotes Webthinker-R1-7B. FactScore evaluation was omitted for Webthinker, which does not include citation functionality, and for OpenAI Deep Research, which does not provide citations when exporting its results. Details of the evaluation are provided in our technical report.*

## Statement
- As a language model, MiniCPM generates content by learning from a vast amount of text.
- However, it does not possess the ability to comprehend or express personal opinions or value judgments.
- Any content generated by MiniCPM does not represent the viewpoints or positions of the model developers.
- Therefore, when using content generated by MiniCPM, users should take full responsibility for evaluating and verifying it on their own.

## LICENSE
- This repository and the MiniCPM models are released under the [Apache-2.0](https://github.com/OpenBMB/MiniCPM/blob/main/LICENSE) License.

## Citation
- Please cite our [paper](https://github.com/OpenBMB/MiniCPM/tree/main/report/MiniCPM_4_Technical_Report.pdf) if you find our work valuable.

```bibtex
@article{minicpm4,
  title={{MiniCPM4}: Ultra-Efficient LLMs on End Devices},
  author={MiniCPM Team},
  journal={arXiv preprint arXiv:2506.07900},
  year={2025}
}
```

# File information

The repository contains the following file information:

Filename: generation_config.json
Content: {
  "bos_token_id": 1,
  "do_sample": true,
  "eos_token_id": [2, 73440],
  "pad_token_id": 2,
  "temperature": 0.8,
  "top_p": 0.8,
  "transformers_version": "4.46.1"
}

Filename: config.json
Content: {
  "_name_or_path": "openbmb/MiniCPM4-8B",
  "architectures": ["MiniCPMForCausalLM"],
  "auto_map": {
    "AutoConfig": "configuration_minicpm.MiniCPMConfig",
    "AutoModel": "modeling_minicpm.MiniCPMModel",
    "AutoModelForCausalLM": "modeling_minicpm.MiniCPMForCausalLM",
    "AutoModelForSeq2SeqLM": "modeling_minicpm.MiniCPMForCausalLM",
    "AutoModelForSequenceClassification": "modeling_minicpm.MiniCPMForSequenceClassification"
  },
  "bos_token_id": 1,
  "eos_token_id": [2, 73440],
  "pad_token_id": 2,
  "hidden_act": "silu",
  "hidden_size": 4096,
  "initializer_range": 0.1,
  "intermediate_size": 16384,
  "max_position_embeddings": 32768,
  "model_type": "minicpm",
  "num_attention_heads": 32,
  "num_hidden_layers": 32,
  "num_key_value_heads": 2,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "rope_type": "longrope",
    "long_factor": [
      0.9977997200264581, 1.014658295992452, 1.0349680404997148, 1.059429246056193,
      1.0888815016813513, 1.1243301355211495, 1.166977103606075, 1.2182568066927284,
      1.2798772354275727, 1.3538666751582975, 1.4426259039919596, 1.5489853358570191,
      1.6762658237220625, 1.8283407612492941, 2.0096956085876183, 2.225478927469756,
      2.481536379650452, 2.784415934557119, 3.1413289096347365, 3.560047844772632,
      4.048719380066383, 4.615569542115128, 5.2684819496549835, 6.014438591970396,
      6.858830049237097, 7.804668263503327, 8.851768731513417, 9.99600492938444,
      11.228766118181639, 12.536757560834843, 13.902257701387796, 15.303885189125953,
      16.717837610115794, 18.119465097853947, 19.484965238406907, 20.792956681060105,
      22.02571786985731, 23.16995406772833, 24.217054535738416, 25.16289275000465,
      26.007284207271347, 26.753240849586767, 27.40615325712662, 27.973003419175363,
      28.461674954469114, 28.880393889607006, 29.237306864684626, 29.540186419591297,
      29.79624387177199, 30.01202719065413, 30.193382037992453, 30.34545697551969,
      30.47273746338473, 30.579096895249787, 30.66785612408345, 30.741845563814174,
      30.80346599254902, 30.85474569563567, 30.897392663720595, 30.932841297560394,
      30.962293553185553, 30.986754758742034, 31.007064503249293, 31.02392307921529
    ],
    "short_factor": [
      0.9977997200264581, 1.014658295992452, 1.0349680404997148, 1.059429246056193,
      1.0888815016813513, 1.1243301355211495, 1.166977103606075, 1.2182568066927284,
      1.2798772354275727, 1.3538666751582975, 1.4426259039919596, 1.5489853358570191,
      1.6762658237220625, 1.8283407612492941, 2.0096956085876183, 2.225478927469756,
      2.481536379650452, 2.784415934557119, 3.1413289096347365, 3.560047844772632,
      4.048719380066383, 4.615569542115128, 5.2684819496549835, 6.014438591970396,
      6.858830049237097, 7.804668263503327, 8.851768731513417, 9.99600492938444,
      11.228766118181639, 12.536757560834843, 13.902257701387796, 15.303885189125953,
      16.717837610115794, 18.119465097853947, 19.484965238406907, 20.792956681060105,
      22.02571786985731, 23.16995406772833, 24.217054535738416, 25.16289275000465,
      26.007284207271347, 26.753240849586767, 27.40615325712662, 27.973003419175363,
      28.461674954469114, 28.880393889607006, 29.237306864684626, 29.540186419591297,
      29.79624387177199, 30.01202719065413, 30.193382037992453, 30.34545697551969,
      30.47273746338473, 30.579096895249787, 30.66785612408345, 30.741845563814174,
      30.80346599254902, 30.85474569563567, 30.897392663720595, 30.932841297560394,
      30.962293553185553, 30.986754758742034, 31.007064503249293, 31.02392307921529
    ],
    "original_max_position_embeddings": 32768
  }
}
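
For reference, the two configs above translate into a standard `transformers` loading path. This is a minimal sketch, not the repository's own pipeline (which runs through `scripts/run.sh`): `trust_remote_code=True` is required because `auto_map` points at the custom MiniCPM classes, and the sampling arguments mirror generation_config.json.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

path = "openbmb/MiniCPM4-Survey"
tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    path, torch_dtype=torch.bfloat16, trust_remote_code=True
)

# Sampling settings follow generation_config.json: do_sample with
# temperature=0.8 and top_p=0.8.
inputs = tokenizer("Write a survey outline on efficient LLM inference.", return_tensors="pt")
outputs = model.generate(
    **inputs, max_new_tokens=256, do_sample=True, temperature=0.8, top_p=0.8
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```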

Filename: added_tokens.json
Content: {
  "<|execute_end|>": 73444,
  "<|execute_start|>": 73443,
  "<|fim_middle|>": 73446,
  "<|fim_prefix|>": 73445,
  "<|fim_suffix|>": 73447,
  "<|im_end|>": 73440,
  "<|im_start|>": 73441,
  "<|tool_call|>": 73442
}

Filename: special_tokens_map.json
Content: {
  "additional_special_tokens": [
    "<|im_end|>",
    "<|im_start|>",
    "<|tool_call|>",
    "<|execute_start|>",
    "<|execute_end|>",
    "<|fim_prefix|>",
    "<|fim_middle|>",
    "<|fim_suffix|>"
  ],
  "bos_token": {
    "content": "<s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "eos_token": {
    "content": "<|im_end|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "<unk>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
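
The `<|im_start|>`/`<|im_end|>` pair above indicates a ChatML-style dialogue format, so prompts are best built through the tokenizer's chat template rather than assembled by hand. A short sketch, reusing the `tokenizer` from the loading example above and assuming the repository ships a chat template:

```python
# Render a single-turn conversation into the model's prompt format.
messages = [{"role": "user", "content": "Generate a survey on retrieval-augmented generation."}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)  # delimited with <|im_start|> ... <|im_end|> special tokens
```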

Filename: model.safetensors.index.json
Content: Content of the file is larger than 50 KB, too long to display.

Filename: tokenizer.json
Content: Content of the file is larger than 50 KB, too long to display.

Filename: tokenizer_config.json
Content: {
  "add_bos_token": true,
  "add_eos_token": false,
  "add_prefix_space": null,
  "added_tokens_decoder": {
    "0": {"content": "<unk>", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "1": {"content": "<s>", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "2": {"content": "</s>", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "73440": {"content": "<|im_end|>", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "73441": {"content": "<|im_start|>", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "73442": {"content": "<|tool_call|>", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "73443": {"content": "<|execute_start|>", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "73444": {"content": "<|execute_end|>", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "73445": {"content": "<|fim_prefix|>", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "73446": {"content": "<|fim_middle|>", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "73447": {"content": "<|fim_suffix|>", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true}
  },
  "additional_special_tokens": [
    "<|im_end|>",
    "<|im_start|>",
    "<|tool_call|>",
    "<|execute_start|>",
    "<|execute_end|>",
    "<|fim_prefix|>",
    "<|fim_middle|>",
    "<|fim_suffix|>"
  ]
}