KevinHuSh committed on
Commit
2fd3125
·
1 Parent(s): f30b544

Refine README (#118)

* Refine README

* refine README

* refine README

Files changed (4)
  1. README.md +88 -16
  2. api/apps/llm_app.py +5 -10
  3. api/ragflow_server.py +1 -1
  4. docker/README.md +80 -0
README.md CHANGED
@@ -1,14 +1,64 @@
1
- English | [简体中文](./README_zh.md)
2
 
3
 
4
- ## System Environment Preparation
5
 
6
- ### Install docker
7
 
8
- If your machine doesn't have *Docker* installed, please refer to [Install Docker Engine](https://docs.docker.com/engine/install/)
9
 
10
- ### OS Setups
11
- Firstly, you need to check the following command:
12
  ```bash
13
  121:/ragflow# sysctl vm.max_map_count
14
  vm.max_map_count = 262144
@@ -24,7 +74,11 @@ Add or update the following line in the file:
24
  vm.max_map_count=262144
25
  ```
26
 
27
- ## Here we go!
28
  > If you want to change the basic settings, such as the port or password, please refer to [.env](./docker/.env) before starting the system.
29
 
30
  > If you change anything in [.env](./docker/.env), please check [service_conf.yaml](./docker/service_conf.yaml) which is a
@@ -37,10 +91,13 @@ vm.max_map_count=262144
37
  > [OpenAI](https://platform.openai.com/login?launch), [通义千问/QWen](https://dashscope.console.aliyun.com/model),
38
  > [智谱AI/ZhipuAI](https://open.bigmodel.cn/)
39
  ```bash
40
- 121:/ragflow# cd docker
41
  121:/ragflow/docker# docker compose up -d
42
  ```
43
- If after about a half of minutes, use the following command to check the server status. If you can have the following outputs,
44
  _**Hallelujah!**_ You have successfully launched the system.
45
  ```bash
46
  121:/ragflow# docker logs -f ragflow-server
@@ -58,10 +115,25 @@ _**Hallelujah!**_ You have successfully launched the system.
58
  INFO:werkzeug:Press CTRL+C to quit
59
 
60
  ```
61
- Open your browser, after entering the IP address of your server, if you see the flowing in your browser, _**Hallelujah**_ again!
62
- > The default serving port is 80, if you want to change that, please refer to [ragflow.conf](./nginx/ragflow.conf),
63
- > and change the *listen* value.
64
-
65
- <div align="center" style="margin-top:20px;margin-bottom:20px;">
66
- <img src="https://github.com/infiniflow/ragflow/assets/12318111/b24a7a5f-4d1d-4a30-90b1-7b0ec558b79d" width="1000"/>
67
- </div>
1
+ <div align="center">
2
+ <a href="https://ragflow.io/">
3
+ <img src="https://github.com/infiniflow/ragflow/assets/12318111/f034fb27-b3bf-401b-b213-e1dfa7448d2a" width="320" alt="ragflow logo">
4
+ </a>
5
+ </div>
6
 
7
 
8
+ <p align="center">
9
+ <a href="./README.md">English</a> |
10
+ <a href="./README_zh.md">简体中文</a>
11
+ </p>
12
 
13
+ <p align="center">
14
+ <a href="https://ragflow.io" target="_blank">
15
+ <img alt="Static Badge" src="https://img.shields.io/badge/RAGFLOW-LLM-white?&labelColor=dd0af7"></a>
16
+ <a href="https://hub.docker.com/r/infiniflow/ragflow" target="_blank">
17
+ <img src="https://img.shields.io/badge/docker_pull-ragflow:v1.0-brightgreen"
18
+ alt="docker pull ragflow:v1.0"></a>
19
+ <a href="https://github.com/infiniflow/ragflow/blob/main/LICENSE">
20
+ <img height="21" src="https://img.shields.io/badge/License-Apache--2.0-ffffff?style=flat-square&labelColor=d4eaf7&color=7d09f1" alt="license">
21
+ </a>
22
+ </p>
23
 
24
+ [RAGFLOW](http://ragflow.io) is a knowledge management platform built on a custom-built document understanding engine and LLMs,
25
+ providing reasoned and well-founded answers to your questions. By cloning this repository, you can deploy your own knowledge management
26
+ platform to empower your business with AI.
27
+
28
+ <div align="center" style="margin-top:20px;margin-bottom:20px;">
29
+ <img src="https://github.com/infiniflow/ragflow/assets/12318111/b24a7a5f-4d1d-4a30-90b1-7b0ec558b79d" width="1000"/>
30
+ </div>
31
+
32
+ # Features
33
+ - **Custom-built document understanding engine.** Our deep learning engine is designed for analyzing and searching various types of documents across different domains.
34
+ - For documents from different domains and for different purposes, the engine applies different analysis and search strategies.
35
+ - Easily intervene in and adjust the data processing procedure when things do not go as expected.
36
+ - Multi-media document understanding is supported using OCR and multi-modal LLMs.
37
+ - **State-of-the-art table structure and layout recognition.** Precisely extract and understand documents, including table content. [README](./deepdoc/README.md)
38
+ - For PDF files, layouts and table structures, including rows, columns, and spans, are recognized.
39
+ - Tables that span multiple pages are put back together.
40
+ - Table structure components are reconstructed into an HTML table.
41
+ - **Querying data dumped from databases is supported.** After uploading tables from any database, you can search any data record just by asking.
42
+ - Instead of using SQL to query a database, anyone can get the data they want just by asking in natural language.
43
+ - The number of uploaded records is not limited.
44
+ - Some extra descriptions of the column headers should be provided.
45
+ - **Reasoned and well-founded answers.** The part of the document cited in the LLM's answer is provided and pointed out in the original document.
46
+ - The answers are based on retrieved results, to which we apply hybrid vector-keyword search and reranking.
47
+ - The part of the document cited in the answer is presented in the most expressive way.
48
+ - For PDF files, the cited parts of the document can be located in the original PDF.
49
+
50
 
51
+ # Release Notification
52
+ **Star us on GitHub, and be notified of new releases instantly!**
53
+ ![star-us](https://github.com/langgenius/dify/assets/100913391/95f37259-7370-4456-a9f0-0bc01ef8642f)
54
+
55
+ # Installation
56
+ ## System Requirements
57
+ Check the minimum system requirements before starting the installation.
58
+ - CPU >= 2 cores
59
+ - RAM >= 8GB
60
+
61
+ Then, check the value of *vm.max_map_count* with the following command:
62
  ```bash
63
  121:/ragflow# sysctl vm.max_map_count
64
  vm.max_map_count = 262144
 
74
  vm.max_map_count=262144
75
  ```
76
 
77
+ ## Install Docker
78
+
79
+ If your machine doesn't have *Docker* installed, please refer to [Install Docker Engine](https://docs.docker.com/engine/install/)
80
+
81
+ ## Quick Start
82
  > If you want to change the basic settings, such as the port or password, please refer to [.env](./docker/.env) before starting the system.
83
 
84
  > If you change anything in [.env](./docker/.env), please check [service_conf.yaml](./docker/service_conf.yaml) which is a
 
91
  > [OpenAI](https://platform.openai.com/login?launch), [通义千问/QWen](https://dashscope.console.aliyun.com/model),
92
  > [智谱AI/ZhipuAI](https://open.bigmodel.cn/)
93
  ```bash
94
+ 121:/# git clone https://github.com/infiniflow/ragflow.git
95
+ 121:/# cd ragflow/docker
96
  121:/ragflow/docker# docker compose up -d
97
  ```
98
+ > The core image is about 15GB, so please be patient the first time you pull it.
99
+
100
+ After all the images have been pulled and the containers are up, use the following command to check the server status. If you see the following output,
101
  _**Hallelujah!**_ You have successfully launched the system.
102
  ```bash
103
  121:/ragflow# docker logs -f ragflow-server
 
115
  INFO:werkzeug:Press CTRL+C to quit
116
 
117
  ```
118
+ Open your browser and enter the IP address of your server. _**Hallelujah**_ again!
119
+ > The default serving port is 80. To change it, please refer to [docker-compose.yml](./docker-compose.yaml)
120
+ > and change the left part of *'80:80'*.
121
+
122
+ # Configuration
123
+ If you need to change the default settings of the system when you deploy it, there are several ways to configure it.
124
+ Please refer to [README](./docker/README.md) and set the configuration manually.
125
+ After changing anything, please run *docker-compose up -d* again.
126
+
127
+ # Roadmap
128
+
129
+ - [ ] File manager.
130
+ - [ ] Support URLs. Crawl web and extract the main content.
131
+
132
+
133
+ # Contributing
134
+
135
+ For those who'd like to contribute code, see our [Contribution Guide](https://github.com/infiniflow/ragflow/blob/main/CONTRIBUTING.md).
136
+
137
+ # License
138
+
139
+ This repository is available under the [Ragflow Open Source License](LICENSE), which is essentially Apache 2.0 with a few additional restrictions.
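The `vm.max_map_count` check in the System Requirements section of the new README can also be scripted. The following is a minimal sketch in Python, assuming a Linux host where the kernel exposes the setting at `/proc/sys/vm/max_map_count`. The function names are hypothetical, and treating 262144 as a lower bound is an assumption of this sketch rather than something the README states.

```python
from pathlib import Path

# Value shown in the README's `sysctl vm.max_map_count` output.
REQUIRED_MAX_MAP_COUNT = 262144


def read_max_map_count(path: str = "/proc/sys/vm/max_map_count") -> int:
    """Read the current vm.max_map_count from procfs (Linux only)."""
    return int(Path(path).read_text().strip())


def is_sufficient(value: int, required: int = REQUIRED_MAX_MAP_COUNT) -> bool:
    """Treat the README value as a lower bound (an assumption of this sketch)."""
    return value >= required


def describe(value: int) -> str:
    """Build a one-line, human-readable report for the given value."""
    if is_sufficient(value):
        return f"vm.max_map_count = {value} (ok)"
    return f"vm.max_map_count = {value}; expected at least {REQUIRED_MAX_MAP_COUNT}"
```

On a Linux machine, `print(describe(read_max_map_count()))` reports whether the current setting matches what the README expects.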
api/apps/llm_app.py CHANGED
@@ -15,18 +15,12 @@
15
  #
16
  from flask import request
17
  from flask_login import login_required, current_user
18
-
19
- from api.db.services import duplicate_name
20
  from api.db.services.llm_service import LLMFactoriesService, TenantLLMService, LLMService
21
- from api.db.services.user_service import TenantService, UserTenantService
22
  from api.utils.api_utils import server_error_response, get_data_error_result, validate_request
23
- from api.utils import get_uuid, get_format_time
24
- from api.db import StatusEnum, UserTenantRole, LLMType
25
- from api.db.services.knowledgebase_service import KnowledgebaseService
26
- from api.db.db_models import Knowledgebase, TenantLLM
27
- from api.settings import stat_logger, RetCode
28
  from api.utils.api_utils import get_json_result
29
- from rag.llm import EmbeddingModel, CvModel, ChatModel
30
 
31
 
32
  @manager.route('/factories', methods=['GET'])
@@ -119,4 +113,5 @@ def list():
119
 
120
  return get_json_result(data=res)
121
  except Exception as e:
122
- return server_error_response(e)
15
  #
16
  from flask import request
17
  from flask_login import login_required, current_user
18
  from api.db.services.llm_service import LLMFactoriesService, TenantLLMService, LLMService
19
  from api.utils.api_utils import server_error_response, get_data_error_result, validate_request
20
+ from api.db import StatusEnum, LLMType
21
+ from api.db.db_models import TenantLLM
22
  from api.utils.api_utils import get_json_result
23
+ from rag.llm import EmbeddingModel, ChatModel
24
 
25
 
26
  @manager.route('/factories', methods=['GET'])
 
113
 
114
  return get_json_result(data=res)
115
  except Exception as e:
116
+ return server_error_response(e)
117
+
api/ragflow_server.py CHANGED
@@ -40,7 +40,7 @@ if __name__ == '__main__':
40
  /_/ |_| \__,_/ \__, //_/ /_/ \____/ |__/|__/
41
  /____/
42
 
43
- """)
44
  stat_logger.info(
45
  f'project base: {utils.file_utils.get_project_base_directory()}'
46
  )
 
40
  /_/ |_| \__,_/ \__, //_/ /_/ \____/ |__/|__/
41
  /____/
42
 
43
+ """, flush=True)
44
  stat_logger.info(
45
  f'project base: {utils.file_utils.get_project_base_directory()}'
46
  )
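The only change to `api/ragflow_server.py` adds `flush=True` to the banner `print`. When stdout is not a terminal (for example, when it is captured by `docker logs`), Python typically block-buffers it, so printed text can sit in the buffer instead of appearing right away. The sketch below illustrates what `flush=True` does, using a hypothetical `CountingStream` class as a stand-in for stdout; it is not taken from the RAGFlow code.

```python
import io


class CountingStream(io.StringIO):
    """In-memory text stream that counts how often flush() is called."""

    def __init__(self) -> None:
        super().__init__()
        self.flush_calls = 0

    def flush(self) -> None:
        self.flush_calls += 1
        super().flush()


stream = CountingStream()

# Without flush=True, print() only writes to the stream.
print("banner", file=stream)
calls_without_flush = stream.flush_calls

# With flush=True, print() also calls stream.flush() after writing.
print("banner", file=stream, flush=True)
calls_with_flush = stream.flush_calls
```

With a real buffered stdout, that extra `flush()` call is what pushes the banner out immediately instead of leaving it in the buffer.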
docker/README.md ADDED
@@ -0,0 +1,80 @@
1
+
2
+ # Docker Environment Variables
3
+
4
+ The [.env](./.env) file contains some important variables.
5
+
6
+ ## MYSQL_PASSWORD
7
+ The MySQL password can be changed with this variable, but you must change *mysql.password* in [service_conf.yaml](./service_conf.yaml) at the same time.
8
+
9
+
10
+ ## MYSQL_PORT
11
+ The exposed port number of the MySQL Docker container. It is useful if you want to access the database from outside the Docker containers.
12
+
13
+ ## MINIO_USER
14
+ The user name for [MinIO](https://github.com/minio/minio). Any change must also be applied to *minio.user* in [service_conf.yaml](./service_conf.yaml).
15
+
16
+ ## MINIO_PASSWORD
17
+ The user password for [MinIO](https://github.com/minio/minio). Any change must also be applied to *minio.password* in [service_conf.yaml](./service_conf.yaml).
18
+
19
+
20
+ ## SVR_HTTP_PORT
21
+ The serving port of the API server.
22
+
23
+
24
+ # Service Configuration
25
+ [service_conf.yaml](./service_conf.yaml) is used by the *API server* and *task executor*. It's the most important configuration of the system.
26
+
27
+ ## ragflow
28
+
29
+ ### host
30
+ The IP address used by the API server.
31
+
32
+ ### port
33
+ The serving port of the API server.
34
+
35
+ ## mysql
36
+
37
+ ### name
38
+ The name of the MySQL database used by this system.
39
+
40
+ ### user
41
+ The database user name.
42
+
43
+ ### password
44
+ The database password. Any change must also be applied to *MYSQL_PASSWORD* in [.env](./.env).
45
+
46
+ ### port
47
+ The serving port of MySQL inside the container. Any change must also be applied to [docker-compose.yml](./docker-compose.yml).
48
+
49
+ ### max_connections
50
+ The maximum number of database connections.
51
+
52
+ ### stale_timeout
53
+ The timeout duration in seconds.
54
+
55
+ ## minio
56
+
57
+ ### user
58
+ The user name for MinIO. Any change must also be applied to *MINIO_USER* in [.env](./.env).
59
+
60
+ ### password
61
+ The password for MinIO. Any change must also be applied to *MINIO_PASSWORD* in [.env](./.env).
62
+
63
+ ### host
64
+ The serving IP and port inside the Docker container. Do not change this unless you also change the minio part of [docker-compose.yml](./docker-compose.yml).
65
+
66
+ ## user_default_llm
67
+ Newly signed-up users use the LLM configured in this section. Otherwise, users need to configure their own LLM in *setting*.
68
+
69
+ ### factory
70
+ The LLM supplier. "通义千问", "OpenAI" and "智谱AI" are supported.
71
+
72
+ ### api_key
73
+ The API key for the LLM vendor you selected.
74
+
75
+ ## oauth
76
+ The OAuth configuration, which allows users to sign up and sign in to the system with a third-party account.
77
+
78
+ ### github
79
+ Go to [GitHub](https://github.com/settings/developers) and register a new application; the *client_id* and *secret_key* will then be provided.
80
+
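The new `docker/README.md` pairs three `.env` variables with keys in `service_conf.yaml`: `MYSQL_PASSWORD` with `mysql.password`, `MINIO_USER` with `minio.user`, and `MINIO_PASSWORD` with `minio.password`. A minimal sketch of a consistency check is given here. It assumes a plain `KEY=VALUE` format for `.env` and takes the service configuration as an already-parsed dictionary, so that no YAML library is needed. `parse_env` and `find_mismatches` are hypothetical helper names, not part of RAGFlow, and all values shown are placeholders.

```python
from typing import Dict, List

# (.env variable) -> (section, key) in service_conf.yaml, as paired in this README.
SYNCED_SETTINGS = {
    "MYSQL_PASSWORD": ("mysql", "password"),
    "MINIO_USER": ("minio", "user"),
    "MINIO_PASSWORD": ("minio", "password"),
}


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def find_mismatches(env: Dict[str, str], service_conf: dict) -> List[str]:
    """Return the .env variable names whose value differs from service_conf."""
    mismatched = []
    for env_key, (section, key) in SYNCED_SETTINGS.items():
        if env_key not in env:
            continue  # variable not set in .env; nothing to compare
        conf_value = service_conf.get(section, {}).get(key)
        if str(conf_value) != env[env_key]:
            mismatched.append(env_key)
    return mismatched


# Placeholder values for illustration only.
example_env = parse_env("MYSQL_PASSWORD=secret\nMINIO_USER=admin\nMINIO_PASSWORD=changed\n")
example_conf = {
    "mysql": {"password": "secret"},
    "minio": {"user": "admin", "password": "original"},
}
example_result = find_mismatches(example_env, example_conf)  # ["MINIO_PASSWORD"]
```

Running a check like this before `docker compose up -d` would flag a variable that was changed in one file but not the other.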