KevinHuSh writinwaters committed on
Commit
a68b28a
·
1 Parent(s): 3d90fa3

update docs for release 0.8.0 (#1419)

### What problem does this PR solve?

update docs for release 0.8.0

### Type of change

- [x] Documentation Update

---------

Co-authored-by: writinwaters <[email protected]>

README.md CHANGED
@@ -17,7 +17,7 @@
17
  <a href="https://demo.ragflow.io" target="_blank">
18
  <img alt="Static Badge" src="https://img.shields.io/badge/Online-Demo-4e6b99"></a>
19
  <a href="https://hub.docker.com/r/infiniflow/ragflow" target="_blank">
20
- <img src="https://img.shields.io/badge/docker_pull-ragflow:v0.7.0-brightgreen" alt="docker pull infiniflow/ragflow:v0.7.0"></a>
21
  <a href="https://github.com/infiniflow/ragflow/blob/main/LICENSE">
22
  <img height="21" src="https://img.shields.io/badge/License-Apache--2.0-ffffff?labelColor=d4eaf7&color=2e6cc4" alt="license">
23
  </a>
@@ -64,16 +64,16 @@ Try our demo at [https://demo.ragflow.io](https://demo.ragflow.io).
64
 
65
  ## 📌 Latest Updates
66
 
 
 
67
  - 2024-06-27 Supports Markdown and Docx in the Q&A parsing method. Supports extracting images from Docx files. Supports extracting tables from Markdown files.
68
  - 2024-06-14 Supports PDF in the Q&A parsing method.
69
-
70
  - 2024-06-06 Supports [Self-RAG](https://huggingface.co/papers/2310.11511), which is enabled by default in dialog settings.
71
  - 2024-05-30 Integrates [BCE](https://github.com/netease-youdao/BCEmbedding) and [BGE](https://github.com/FlagOpen/FlagEmbedding) reranker models.
72
  - 2024-05-28 Supports LLM Baichuan and VolcanoArk.
73
  - 2024-05-23 Supports [RAPTOR](https://arxiv.org/html/2401.18059v1) for better text retrieval.
74
  - 2024-05-21 Supports streaming output and text chunk retrieval API.
75
  - 2024-05-15 Integrates OpenAI GPT-4o.
76
- - 2024-05-08 Integrates LLM DeepSeek-V2.
77
 
78
  ## 🌟 Key Features
79
 
@@ -150,7 +150,7 @@ Try our demo at [https://demo.ragflow.io](https://demo.ragflow.io).
150
 
151
  3. Build the pre-built Docker images and start up the server:
152
 
153
- > Running the following commands automatically downloads the *dev* version RAGFlow Docker image. To download and run a specified Docker version, update `RAGFLOW_VERSION` in **docker/.env** to the intended version, for example `RAGFLOW_VERSION=v0.7.0`, before running the following commands.
154
 
155
  ```bash
156
  $ cd ragflow/docker
 
17
  <a href="https://demo.ragflow.io" target="_blank">
18
  <img alt="Static Badge" src="https://img.shields.io/badge/Online-Demo-4e6b99"></a>
19
  <a href="https://hub.docker.com/r/infiniflow/ragflow" target="_blank">
20
+ <img src="https://img.shields.io/badge/docker_pull-ragflow:v0.8.0-brightgreen" alt="docker pull infiniflow/ragflow:v0.8.0"></a>
21
  <a href="https://github.com/infiniflow/ragflow/blob/main/LICENSE">
22
  <img height="21" src="https://img.shields.io/badge/License-Apache--2.0-ffffff?labelColor=d4eaf7&color=2e6cc4" alt="license">
23
  </a>
 
64
 
65
  ## 📌 Latest Updates
66
 
67
+ - 2024-07-08 Supports [Graph](./graph/README.md).
68
+
69
  - 2024-06-27 Supports Markdown and Docx in the Q&A parsing method. Supports extracting images from Docx files. Supports extracting tables from Markdown files.
70
  - 2024-06-14 Supports PDF in the Q&A parsing method.
 
71
  - 2024-06-06 Supports [Self-RAG](https://huggingface.co/papers/2310.11511), which is enabled by default in dialog settings.
72
  - 2024-05-30 Integrates [BCE](https://github.com/netease-youdao/BCEmbedding) and [BGE](https://github.com/FlagOpen/FlagEmbedding) reranker models.
73
  - 2024-05-28 Supports LLM Baichuan and VolcanoArk.
74
  - 2024-05-23 Supports [RAPTOR](https://arxiv.org/html/2401.18059v1) for better text retrieval.
75
  - 2024-05-21 Supports streaming output and text chunk retrieval API.
76
  - 2024-05-15 Integrates OpenAI GPT-4o.
 
77
 
78
  ## 🌟 Key Features
79
 
 
150
 
151
  3. Build the pre-built Docker images and start up the server:
152
 
153
+ > Running the following commands automatically downloads the *dev* version RAGFlow Docker image. To download and run a specified Docker version, update `RAGFLOW_VERSION` in **docker/.env** to the intended version, for example `RAGFLOW_VERSION=v0.8.0`, before running the following commands.
154
 
155
  ```bash
156
  $ cd ragflow/docker
README_ja.md CHANGED
@@ -17,8 +17,8 @@
17
  <a href="https://demo.ragflow.io" target="_blank">
18
  <img alt="Static Badge" src="https://img.shields.io/badge/Online-Demo-4e6b99"></a>
19
  <a href="https://hub.docker.com/r/infiniflow/ragflow" target="_blank">
20
- <img src="https://img.shields.io/badge/docker_pull-ragflow:v0.7.0-brightgreen"
21
- alt="docker pull infiniflow/ragflow:v0.7.0"></a>
22
  <a href="https://github.com/infiniflow/ragflow/blob/main/LICENSE">
23
  <img height="21" src="https://img.shields.io/badge/License-Apache--2.0-ffffff?labelColor=d4eaf7&color=2e6cc4" alt="license">
24
  </a>
@@ -45,15 +45,15 @@
45
 
46
 
47
  ## 📌 最新情報
 
48
  - 2024-06-27 Q&A解析方式はMarkdownファイルとDocxファイルをサポートしています。Docxファイルからの画像の抽出をサポートします。Markdownファイルからテーブルを抽出することをサポートします。
49
  - 2024-06-14 Q&A 解析メソッドは PDF ファイルをサポートしています。
50
  - 2024-06-06 会話設定でデフォルトでチェックされている [Self-RAG](https://huggingface.co/papers/2310.11511) をサポートします。
51
- - 2024-05-30 [BCE](https://github.com/netease-youdao/BCEmbedding)、[BGE](https://github.com/FlagOpen/FlagEmbedding) reranker を統合。
52
  - 2024-05-28 LLM BaichuanとVolcanoArkを統合しました。
53
- - 2024-05-23 より良いテキスト検索のために[RAPTOR](https://arxiv.org/html/2401.18059v1)をサポート。
54
  - 2024-05-21 ストリーミング出力とテキストチャンク取得APIをサポート。
55
  - 2024-05-15 OpenAI GPT-4oを統合しました。
56
- - 2024-05-08 LLM DeepSeek-V2を統合しました。
57
 
58
  ## 🌟 主な特徴
59
 
@@ -136,7 +136,7 @@
136
  $ docker compose up -d
137
  ```
138
 
139
- > 上記のコマンドを実行すると、RAGFlowの開発版dockerイメージが自動的にダウンロードされます。 特定のバージョンのDockerイメージをダウンロードして実行したい場合は、docker/.envファイルのRAGFLOW_VERSION変数を見つけて、対応するバージョンに変更してください。 例えば、RAGFLOW_VERSION=v0.7.0として、上記のコマンドを実行してください。
140
 
141
  > コアイメージのサイズは約 9 GB で、ロードに時間がかかる場合があります。
142
 
@@ -198,7 +198,7 @@
198
  ```bash
199
  $ git clone https://github.com/infiniflow/ragflow.git
200
  $ cd ragflow/
201
- $ docker build -t infiniflow/ragflow:v0.7.0 .
202
  $ cd ragflow/docker
203
  $ chmod +x ./entrypoint.sh
204
  $ docker compose up -d
 
17
  <a href="https://demo.ragflow.io" target="_blank">
18
  <img alt="Static Badge" src="https://img.shields.io/badge/Online-Demo-4e6b99"></a>
19
  <a href="https://hub.docker.com/r/infiniflow/ragflow" target="_blank">
20
+ <img src="https://img.shields.io/badge/docker_pull-ragflow:v0.8.0-brightgreen"
21
+ alt="docker pull infiniflow/ragflow:v0.8.0"></a>
22
  <a href="https://github.com/infiniflow/ragflow/blob/main/LICENSE">
23
  <img height="21" src="https://img.shields.io/badge/License-Apache--2.0-ffffff?labelColor=d4eaf7&color=2e6cc4" alt="license">
24
  </a>
 
45
 
46
 
47
  ## 📌 最新情報
48
+ - 2024-07-08 [Graph](./graph/README.md) に対応しました。.
49
  - 2024-06-27 Q&A解析方式はMarkdownファイルとDocxファイルをサポートしています。Docxファイルからの画像の抽出をサポートします。Markdownファイルからテーブルを抽出することをサポートします。
50
  - 2024-06-14 Q&A 解析メソッドは PDF ファイルをサポートしています。
51
  - 2024-06-06 会話設定でデフォルトでチェックされている [Self-RAG](https://huggingface.co/papers/2310.11511) をサポートします。
52
+ - 2024-05-30 [BCE](https://github.com/netease-youdao/BCEmbedding) 、[BGE](https://github.com/FlagOpen/FlagEmbedding) reranker を統合。
53
  - 2024-05-28 LLM BaichuanとVolcanoArkを統合しました。
54
+ - 2024-05-23 より良いテキスト検索のために [RAPTOR](https://arxiv.org/html/2401.18059v1) をサポート。
55
  - 2024-05-21 ストリーミング出力とテキストチャンク取得APIをサポート。
56
  - 2024-05-15 OpenAI GPT-4oを統合しました。
 
57
 
58
  ## 🌟 主な特徴
59
 
 
136
  $ docker compose up -d
137
  ```
138
 
139
+ > 上記のコマンドを実行すると、RAGFlowの開発版dockerイメージが自動的にダウンロードされます。 特定のバージョンのDockerイメージをダウンロードして実行したい場合は、docker/.envファイルのRAGFLOW_VERSION変数を見つけて、対応するバージョンに変更してください。 例えば、RAGFLOW_VERSION=v0.8.0として、上記のコマンドを実行してください。
140
 
141
  > コアイメージのサイズは約 9 GB で、ロードに時間がかかる場合があります。
142
 
 
198
  ```bash
199
  $ git clone https://github.com/infiniflow/ragflow.git
200
  $ cd ragflow/
201
+ $ docker build -t infiniflow/ragflow:v0.8.0 .
202
  $ cd ragflow/docker
203
  $ chmod +x ./entrypoint.sh
204
  $ docker compose up -d
README_zh.md CHANGED
@@ -17,7 +17,7 @@
17
  <a href="https://demo.ragflow.io" target="_blank">
18
  <img alt="Static Badge" src="https://img.shields.io/badge/Online-Demo-4e6b99"></a>
19
  <a href="https://hub.docker.com/r/infiniflow/ragflow" target="_blank">
20
- <img src="https://img.shields.io/badge/docker_pull-ragflow:v0.7.0-brightgreen" alt="docker pull infiniflow/ragflow:v0.7.0"></a>
21
  <a href="https://github.com/infiniflow/ragflow/blob/main/LICENSE">
22
  <img height="21" src="https://img.shields.io/badge/License-Apache--2.0-ffffff?labelColor=d4eaf7&color=2e6cc4" alt="license">
23
  </a>
@@ -45,6 +45,7 @@
45
 
46
  ## 📌 近期更新
47
 
 
48
  - 2024-06-27 Q&A 解析方式支持 Markdown 文件和 Docx 文件。支持提取出 Docx 文件中的图片。支持提取出 Markdown 文件中的表格。
49
  - 2024-06-14 Q&A 解析方式支持 PDF 文件。
50
  - 2024-06-06 支持 [Self-RAG](https://huggingface.co/papers/2310.11511) ,在对话设置里面默认勾选。
@@ -53,7 +54,6 @@
53
  - 2024-05-23 实现 [RAPTOR](https://arxiv.org/html/2401.18059v1) 提供更好的文本检索。
54
  - 2024-05-21 支持流式结果输出和文本块获取API。
55
  - 2024-05-15 集成大模型 OpenAI GPT-4o。
56
- - 2024-05-08 集成大模型 DeepSeek。
57
 
58
  ## 🌟 主要功能
59
 
@@ -136,7 +136,7 @@
136
  $ docker compose -f docker-compose-CN.yml up -d
137
  ```
138
 
139
- > 请注意,运行上述命令会自动下载 RAGFlow 的开发版本 docker 镜像。如果你想下载并运行特定版本的 docker 镜像,请在 docker/.env 文件中找到 RAGFLOW_VERSION 变量,将其改为对应版本。例如 RAGFLOW_VERSION=v0.7.0,然后运行上述命令。
140
 
141
  > 核心镜像文件大约 9 GB,可能需要一定时间拉取。请耐心等待。
142
 
@@ -198,7 +198,7 @@
198
  ```bash
199
  $ git clone https://github.com/infiniflow/ragflow.git
200
  $ cd ragflow/
201
- $ docker build -t infiniflow/ragflow:v0.7.0 .
202
  $ cd ragflow/docker
203
  $ chmod +x ./entrypoint.sh
204
  $ docker compose up -d
 
17
  <a href="https://demo.ragflow.io" target="_blank">
18
  <img alt="Static Badge" src="https://img.shields.io/badge/Online-Demo-4e6b99"></a>
19
  <a href="https://hub.docker.com/r/infiniflow/ragflow" target="_blank">
20
+ <img src="https://img.shields.io/badge/docker_pull-ragflow:v0.8.0-brightgreen" alt="docker pull infiniflow/ragflow:v0.8.0"></a>
21
  <a href="https://github.com/infiniflow/ragflow/blob/main/LICENSE">
22
  <img height="21" src="https://img.shields.io/badge/License-Apache--2.0-ffffff?labelColor=d4eaf7&color=2e6cc4" alt="license">
23
  </a>
 
45
 
46
  ## 📌 近期更新
47
 
48
+ - 2024-07-08 支持 [Graph](./graph/README.md)。
49
  - 2024-06-27 Q&A 解析方式支持 Markdown 文件和 Docx 文件。支持提取出 Docx 文件中的图片。支持提取出 Markdown 文件中的表格。
50
  - 2024-06-14 Q&A 解析方式支持 PDF 文件。
51
  - 2024-06-06 支持 [Self-RAG](https://huggingface.co/papers/2310.11511) ,在对话设置里面默认勾选。
 
54
  - 2024-05-23 实现 [RAPTOR](https://arxiv.org/html/2401.18059v1) 提供更好的文本检索。
55
  - 2024-05-21 支持流式结果输出和文本块获取API。
56
  - 2024-05-15 集成大模型 OpenAI GPT-4o。
 
57
 
58
  ## 🌟 主要功能
59
 
 
136
  $ docker compose -f docker-compose-CN.yml up -d
137
  ```
138
 
139
+ > 请注意,运行上述命令会自动下载 RAGFlow 的开发版本 docker 镜像。如果你想下载并运行特定版本的 docker 镜像,请在 docker/.env 文件中找到 RAGFLOW_VERSION 变量,将其改为对应版本。例如 RAGFLOW_VERSION=v0.8.0,然后运行上述命令。
140
 
141
  > 核心镜像文件大约 9 GB,可能需要一定时间拉取。请耐心等待。
142
 
 
198
  ```bash
199
  $ git clone https://github.com/infiniflow/ragflow.git
200
  $ cd ragflow/
201
+ $ docker build -t infiniflow/ragflow:v0.8.0 .
202
  $ cd ragflow/docker
203
  $ chmod +x ./entrypoint.sh
204
  $ docker compose up -d
docs/guides/configure_knowledge_base.md CHANGED
@@ -124,7 +124,7 @@ RAGFlow uses multiple recall of both full-text search and vector search in its c
124
 
125
  ## Search for knowledge base
126
 
127
- As of RAGFlow v0.7.0, the search feature is still in a rudimentary form, supporting only knowledge base search by name.
128
 
129
  ![search knowledge base](https://github.com/infiniflow/ragflow/assets/93570324/836ae94c-2438-42be-879e-c7ad2a59693e)
130
 
 
124
 
125
  ## Search for knowledge base
126
 
127
+ As of RAGFlow v0.8.0, the search feature is still in a rudimentary form, supporting only knowledge base search by name.
128
 
129
  ![search knowledge base](https://github.com/infiniflow/ragflow/assets/93570324/836ae94c-2438-42be-879e-c7ad2a59693e)
130
 
docs/guides/manage_files.md CHANGED
@@ -45,11 +45,11 @@ You can link your file to one knowledge base or multiple knowledge bases at one
45
 
46
  ## Move file to specified folder
47
 
48
- As of RAGFlow v0.7.0, this feature is *not* available.
49
 
50
  ## Search files or folders
51
 
52
- As of RAGFlow v0.7.0, the search feature is still in a rudimentary form, supporting only file and folder search in the current directory by name (files or folders in the child directory will not be retrieved).
53
 
54
  ![search file](https://github.com/infiniflow/ragflow/assets/93570324/77ffc2e5-bd80-4ed1-841f-068e664efffe)
55
 
@@ -81,4 +81,4 @@ RAGFlow's file management allows you to download an uploaded file:
81
 
82
  ![download_file](https://github.com/infiniflow/ragflow/assets/93570324/cf3b297f-7d9b-4522-bf5f-4f45743e4ed5)
83
 
84
- > As of RAGFlow v0.7.0, bulk download is not supported, nor can you download an entire folder.
 
45
 
46
  ## Move file to specified folder
47
 
48
+ As of RAGFlow v0.8.0, this feature is *not* available.
49
 
50
  ## Search files or folders
51
 
52
+ As of RAGFlow v0.8.0, the search feature is still in a rudimentary form, supporting only file and folder search in the current directory by name (files or folders in the child directory will not be retrieved).
53
 
54
  ![search file](https://github.com/infiniflow/ragflow/assets/93570324/77ffc2e5-bd80-4ed1-841f-068e664efffe)
55
 
 
81
 
82
  ![download_file](https://github.com/infiniflow/ragflow/assets/93570324/cf3b297f-7d9b-4522-bf5f-4f45743e4ed5)
83
 
84
+ > As of RAGFlow v0.8.0, bulk download is not supported, nor can you download an entire folder.
docs/quickstart.mdx CHANGED
@@ -34,7 +34,7 @@ This section provides instructions on setting up the RAGFlow server on Linux. If
34
 
35
  `vm.max_map_count`. This value sets the maximum number of memory map areas a process may have. Its default value is 65530. While most applications require fewer than a thousand maps, reducing this value can result in abmornal behaviors, and the system will throw out-of-memory errors when a process reaches the limitation.
36
 
37
- RAGFlow v0.7.0 uses Elasticsearch for multiple recall. Setting the value of `vm.max_map_count` correctly is crucial to the proper functioning of the Elasticsearch component.
38
 
39
  <Tabs
40
  defaultValue="linux"
@@ -132,7 +132,7 @@ This section provides instructions on setting up the RAGFlow server on Linux. If
132
 
133
  3. Build the pre-built Docker images and start up the server:
134
 
135
- > Running the following commands automatically downloads the *dev* version RAGFlow Docker image. To download and run a specified Docker version, update `RAGFLOW_VERSION` in **docker/.env** to the intended version, for example `RAGFLOW_VERSION=v0.7.0`, before running the following commands.
136
 
137
  ```bash
138
  $ cd ragflow/docker
 
34
 
35
  `vm.max_map_count`. This value sets the maximum number of memory map areas a process may have. Its default value is 65530. While most applications require fewer than a thousand maps, reducing this value can result in abmornal behaviors, and the system will throw out-of-memory errors when a process reaches the limitation.
36
 
37
+ RAGFlow v0.8.0 uses Elasticsearch for multiple recall. Setting the value of `vm.max_map_count` correctly is crucial to the proper functioning of the Elasticsearch component.
38
 
39
  <Tabs
40
  defaultValue="linux"
 
132
 
133
  3. Build the pre-built Docker images and start up the server:
134
 
135
+ > Running the following commands automatically downloads the *dev* version RAGFlow Docker image. To download and run a specified Docker version, update `RAGFLOW_VERSION` in **docker/.env** to the intended version, for example `RAGFLOW_VERSION=v0.8.0`, before running the following commands.
136
 
137
  ```bash
138
  $ cd ragflow/docker
graph/canvas.py CHANGED
@@ -204,7 +204,8 @@ class Canvas(ABC):
  cpn = self.get_component(cpn_id)
  if not cpn["downstream"]: break

- if self._find_loop(): raise OverflowError("Too much loops!")
+ loop = self._find_loop()
+ if loop: raise OverflowError(f"Too much loops: {loop}")

  if cpn["obj"].component_name.lower() in ["switch", "categorize", "relevant"]:
      switch_out = cpn["obj"].output()[1].iloc[0, 0]
@@ -277,15 +278,18 @@ class Canvas(ABC):

  if len(path) < 2: return False

- for l in range(1, len(path) // 2):
+ for l in range(2, len(path) // 2):
      pat = ",".join(path[0:l])
      path_str = ",".join(path)
      if len(pat) >= len(path_str): return False
-     path_str = path_str[len(pat):]
      loop = max_loops
-     while path_str.find(pat) >= 0 and loop >= 0:
+     while path_str.find(pat) == 0 and loop >= 0:
          loop -= 1
-         path_str = path_str[len(pat):]
-     if loop < 0: return True
+         if len(pat)+1 >= len(path_str):
+             return False
+         path_str = path_str[len(pat)+1:]
+     if loop < 0:
+         pat = " => ".join([p.split(":")[0] for p in path[0:l]])
+         return pat + " => " + pat

  return False
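
Reviewer note: the revised check above can be read as "strip the same leading run of components from the path repeatedly, and report a loop once it has been stripped more than `max_loops` times". The sketch below is a standalone approximation of that idea, not the project's method. The function name `find_repeated_prefix` is hypothetical, it compares list slices rather than comma-joined strings, and it assumes path entries look like `name:id`, as the `p.split(":")[0]` call in the diff suggests.

```python
def find_repeated_prefix(path, max_loops=2):
    """Return a description of a repeating prefix in `path`, or False.

    Hypothetical standalone helper; it mirrors only the logic visible in
    the diff and assumes entries are strings shaped like "name:id".
    """
    # Candidate pattern sizes start at 2, matching range(2, len(path) // 2).
    for size in range(2, len(path) // 2):
        pattern = path[:size]
        rest = path
        remaining = max_loops
        # Keep stripping the pattern while it sits at the very front.
        while rest[:size] == pattern and remaining >= 0:
            remaining -= 1
            if size >= len(rest):
                # Nothing is left after the pattern, so no loop is reported.
                return False
            rest = rest[size:]
        if remaining < 0:
            # Show component names only, dropping the ":id" suffix.
            names = " => ".join(p.split(":")[0] for p in pattern)
            return names + " => " + names
    return False


print(find_repeated_prefix(["a:1", "b:2"] * 4))  # a => b => a => b
print(find_repeated_prefix(["a:1", "b:2"] * 3))  # False
```

Returning the pattern instead of `True` is what lets the caller put the offending components into the `OverflowError` message.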
graph/component/categorize.py CHANGED
@@ -38,7 +38,7 @@ class CategorizeParam(GenerateParam):
      self.check_empty(self.category_description, "[Categorize] Category examples")
      for k, v in self.category_description.items():
          if not k: raise ValueError(f"[Categorize] Category name can not be empty!")
-         if not v["to"]: raise ValueError(f"[Categorize] 'To' of category {k} can not be empty!")
+         if not v.get("to"): raise ValueError(f"[Categorize] 'To' of category {k} can not be empty!")

  def get_prompt(self):
      cate_lines = []
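
Reviewer note: switching from `v["to"]` to `v.get("to")` means a category that has no `to` key at all now fails with the intended `ValueError` rather than a bare `KeyError`. A minimal sketch of the difference follows. The function `check_categories` and the sample data are hypothetical, written only to illustrate the check shown in the diff.

```python
def check_categories(category_description):
    """Validate a mapping of category name -> settings dict.

    Hypothetical standalone version of the check in the diff.
    """
    for name, spec in category_description.items():
        if not name:
            raise ValueError("[Categorize] Category name can not be empty!")
        # dict.get returns None for a missing key, so a missing "to" and
        # an empty "to" are reported the same way.
        if not spec.get("to"):
            raise ValueError(
                f"[Categorize] 'To' of category {name} can not be empty!"
            )


try:
    check_categories({"billing": {"description": "invoices"}})
except ValueError as err:
    print(err)
```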
graph/component/generate.py CHANGED
@@ -72,7 +72,10 @@ class Generate(ComponentBase):
  for para in self._param.parameters:
      cpn = self._canvas.get_component(para["component_id"])["obj"]
      _, out = cpn.output(allow_partial=False)
-     kwargs[para["key"]] = "\n - ".join(out["content"])
+     if "content" not in out.columns:
+         kwargs[para["key"]] = "Nothing"
+     else:
+         kwargs[para["key"]] = "\n - ".join(out["content"])

  kwargs["input"] = input
  for n, v in kwargs.items():
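
Reviewer note: the guard above substitutes the literal string `"Nothing"` when an upstream component's output has no `content` column, instead of failing on the column lookup. The sketch below isolates that behaviour. The names `join_upstream_content` and `FakeOutput` are hypothetical, and `FakeOutput` is a tiny stand-in that exposes only `.columns` and item access, assuming the real output is a table-like object such as a pandas DataFrame.

```python
def join_upstream_content(out, placeholder="Nothing"):
    """Join the "content" column into a bullet-style string.

    Falls back to `placeholder` when the column is absent, as in the diff.
    """
    if "content" not in out.columns:
        return placeholder
    return "\n - ".join(out["content"])


class FakeOutput:
    """Hypothetical stand-in for a table-like component output."""

    def __init__(self, data):
        self._data = data
        self.columns = list(data)  # column names only

    def __getitem__(self, key):
        return self._data[key]


print(join_upstream_content(FakeOutput({"content": ["first", "second"]})))
print(join_upstream_content(FakeOutput({"answer": ["no content here"]})))
```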