jinhai-2012 committed on
Commit
487f4f4
·
1 Parent(s): cd85d9e

HTTP API documents, part1 (#2713)

### What problem does this PR solve?

1. dataset: create/delete/list/get/update
2. files in dataset: upload/download/list/delete/get_info

### Type of change

- [x] Documentation Update

---------

Signed-off-by: Jin Hai <[email protected]>

Files changed (1)
  1. api/http_api.md +1167 -0
api/http_api.md ADDED
@@ -0,0 +1,1167 @@
1
+
2
+ # HTTP API Reference
3
+
4
+ ## Create dataset
5
+
6
+ **POST** `/api/v1/dataset`
7
+
8
+ Creates a dataset with a name. If a dataset with the same name already exists, RAGFlow automatically renames the new dataset.
9
+
10
+ ### Request
11
+
12
+ - Method: POST
13
+ - URL: `/api/v1/dataset`
14
+ - Headers:
15
+ - `Content-Type: application/json`
16
+ - `Authorization: Bearer {YOUR_ACCESS_TOKEN}`
17
+ - Body:
18
+ - `"dataset_name"`: `string`
19
+ - `"tenant_id"`: `string`
20
+ - `"embedding_model"`: `string`
21
+ - `"chunk_count"`: `integer`
22
+ - `"document_count"`: `integer`
23
+ - `"parse_method"`: `string`
24
+
25
+ #### Request example
26
+
27
+ ```shell
28
+ curl --request POST \
29
+ --url http://{address}/api/v1/dataset \
30
+ --header 'Content-Type: application/json' \
31
+ --header 'Authorization: Bearer {YOUR_ACCESS_TOKEN}' \
32
+ --data-binary '{
33
+ "dataset_name": "test",
34
+ "tenant_id": "4fb0cd625f9311efba4a0242ac120006",
35
+ "embedding_model": "BAAI/bge-large-zh-v1.5",
36
+ "chunk_count": 0,
37
+ "document_count": 0,
38
+ "parse_method": "general"
39
+ }'
40
+ ```
41
+
42
+ #### Request parameters
43
+
44
+ - `"dataset_name"`: (*Body parameter*)
45
+ The name of the dataset, which must adhere to the following requirements:
46
+ - Maximum 65,535 characters.
47
+ - `"tenant_id"`: (*Body parameter*)
48
+ The ID of the tenant.
49
+ - `"embedding_model"`: (*Body parameter*)
50
+ Embedding model used in the dataset.
51
+ - `"chunk_count"`: (*Body parameter*)
52
+ Chunk count of the dataset.
53
+ - `"document_count"`: (*Body parameter*)
54
+ Document count of the dataset.
55
+ - `"parse_method"`: (*Body parameter*)
56
+ Parsing method of the dataset.
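The same request can also be assembled with Python's standard library. The following is a minimal sketch that only builds the request object and does not send it; the helper name `build_create_dataset_request`, the address `http://localhost`, and the token value are assumptions made for illustration and are not part of the documented API.

```python
import json
import urllib.request


def build_create_dataset_request(base_url, token, dataset_name, **fields):
    """Build (but do not send) a POST request for /api/v1/dataset."""
    # Body fields mirror the parameters listed above; extra fields are optional.
    body = {"dataset_name": dataset_name, **fields}
    return urllib.request.Request(
        url=f"{base_url}/api/v1/dataset",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


# Hypothetical address and token, for illustration only.
request = build_create_dataset_request(
    "http://localhost", "YOUR_ACCESS_TOKEN", "test", parse_method="general"
)
```

Passing the resulting object to `urllib.request.urlopen` would send it; that step is left out here because it needs a running server.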
57
+
58
+ ### Response
59
+
60
+ The successful response includes a JSON object like the following:
61
+
62
+ ```shell
63
+ {
64
+ "code": 0
65
+ }
66
+ ```
67
+
68
+ - `"code"`: `integer`
69
+ `0`: The operation succeeds.
70
+
71
+
72
+ The error response includes a JSON object like the following:
73
+
74
+ ```shell
75
+ {
76
+ "code": 3016,
77
+ "message": "Can't connect database"
78
+ }
79
+ ```
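Because success and failure are both reported in the JSON body, a client typically inspects the `"code"` field. The sketch below shows one way to do that, assuming every endpoint follows the two shapes shown above; `ApiError` and `parse_response` are hypothetical names introduced for this example.

```python
import json


class ApiError(Exception):
    """Raised when a response body carries a non-zero "code" (hypothetical helper)."""

    def __init__(self, code, message):
        super().__init__(f"API error {code}: {message}")
        self.code = code
        self.message = message


def parse_response(raw_body):
    """Decode a JSON response body and raise ApiError unless "code" is 0."""
    payload = json.loads(raw_body)
    if payload.get("code") != 0:
        raise ApiError(payload.get("code"), payload.get("message", ""))
    # "data" is only present on some endpoints, so default to None.
    return payload.get("data")


ok = parse_response('{"code": 0}')
try:
    parse_response('{"code": 3016, "message": "Can\'t connect database"}')
except ApiError as err:
    failure = err
```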
80
+
81
+ ## Delete dataset
82
+
83
+ **DELETE** `/api/v1/dataset`
84
+
85
+ Deletes datasets by their IDs or names.
86
+
87
+ ### Request
88
+
89
+ - Method: DELETE
90
+ - URL: `/api/v1/dataset/{dataset_id}`
91
+ - Headers:
92
+ - `Content-Type: application/json`
93
+ - `Authorization: Bearer {YOUR_ACCESS_TOKEN}`
94
+
95
+
96
+ #### Request example
97
+
98
+ ```shell
99
+ curl --request DELETE \
100
+ --url http://{address}/api/v1/dataset/0 \
101
+ --header 'Content-Type: application/json' \
102
+ --header 'Authorization: Bearer {YOUR_ACCESS_TOKEN}' \
103
+ --data ' {
104
+ "names": ["ds1", "ds2"]
105
+ }'
106
+ ```
107
+
108
+ #### Request parameters
109
+
110
+ - `"names"`: (*Body parameter*)
111
+ Dataset names to delete.
112
+ - `"ids"`: (*Body parameter*)
113
+ Dataset IDs to delete.
114
+
115
+ `"names"` and `"ids"` are mutually exclusive: specify one or the other, not both.
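A client can enforce this rule before sending the request. The sketch below is one possible check, assuming that at least one of the two parameters must be supplied (the text above only states that they cannot be combined); `build_delete_body` is a hypothetical helper name.

```python
import json


def build_delete_body(names=None, ids=None):
    """Return the JSON body for a delete request, given names or IDs but not both."""
    # Reject the request when both or neither of the parameters are given.
    if (names is None) == (ids is None):
        raise ValueError('Specify exactly one of "names" or "ids".')
    key, values = ("names", names) if names is not None else ("ids", ids)
    return json.dumps({key: list(values)})


body = build_delete_body(names=["ds1", "ds2"])
```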
116
+
117
+ ### Response
118
+
119
+ The successful response includes a JSON object like the following:
120
+
121
+ ```shell
122
+ {
123
+ "code": 0
124
+ }
125
+ ```
126
+
127
+ - `"code"`: `integer`
128
+ `0`: The operation succeeds.
129
+
130
+
131
+ The error response includes a JSON object like the following:
132
+
133
+ ```shell
134
+ {
135
+ "code": 3016,
136
+ "message": "Try to delete non-existent dataset."
137
+ }
138
+ ```
139
+
140
+ ## Update dataset
141
+
142
+ **PUT** `/api/v1/dataset/{dataset_id}`
143
+
144
+ Updates a dataset by its ID.
145
+
146
+ ### Request
147
+
148
+ - Method: PUT
149
+ - URL: `/api/v1/dataset/{dataset_id}`
150
+ - Headers:
151
+ - `Content-Type: application/json`
152
+ - `Authorization: Bearer {YOUR_ACCESS_TOKEN}`
153
+
154
+
155
+ #### Request example
156
+
157
+ ```shell
158
+ curl --request PUT \
159
+ --url http://{address}/api/v1/dataset/0 \
160
+ --header 'Content-Type: application/json' \
161
+ --header 'Authorization: Bearer {YOUR_ACCESS_TOKEN}' \
162
+ --data-binary '{
163
+ "dataset_name": "test",
164
+ "tenant_id": "4fb0cd625f9311efba4a0242ac120006",
165
+ "embedding_model": "BAAI/bge-large-zh-v1.5",
166
+ "chunk_count": 0,
167
+ "document_count": 0,
168
+ "parse_method": "general"
169
+ }'
170
+ ```
171
+
172
+ #### Request parameters
173
+
174
+ - `"dataset_name"`: (*Body parameter*)
175
+ The name of the dataset, which must adhere to the following requirements:
176
+ - Maximum 65,535 characters.
177
+ - `"tenant_id"`: (*Body parameter*)
178
+ The ID of the tenant.
179
+ - `"embedding_model"`: (*Body parameter*)
180
+ Embedding model used in the dataset.
181
+ - `"chunk_count"`: (*Body parameter*)
182
+ Chunk count of the dataset.
183
+ - `"document_count"`: (*Body parameter*)
184
+ Document count of the dataset.
185
+ - `"parse_method"`: (*Body parameter*)
186
+ Parsing method of the dataset.
187
+
188
+ ### Response
189
+
190
+ The successful response includes a JSON object like the following:
191
+
192
+ ```shell
193
+ {
194
+ "code": 0
195
+ }
196
+ ```
197
+
198
+ - `"code"`: `integer`
199
+ `0`: The operation succeeds.
200
+
201
+
202
+ The error response includes a JSON object like the following:
203
+
204
+ ```shell
205
+ {
206
+ "code": 3016,
207
+ "message": "Can't change embedding model since some files already use it."
208
+ }
209
+ ```
210
+
211
+ ## List datasets
212
+
213
+ **GET** `/api/v1/dataset?name={name}&page={page}&page_size={page_size}&orderby={orderby}&desc={desc}`
214
+
215
+ Lists all datasets.
216
+
217
+ ### Request
218
+
219
+ - Method: GET
220
+ - URL: `/api/v1/dataset?name={name}&page={page}&page_size={page_size}&orderby={orderby}&desc={desc}`
221
+ - Headers:
222
+ - `Content-Type: application/json`
223
+ - `Authorization: Bearer {YOUR_ACCESS_TOKEN}`
224
+
225
+
226
+ #### Request example
227
+
228
+ ```shell
229
+ curl --request GET \
230
+ --url 'http://{address}/api/v1/dataset?page=0&page_size=50&orderby=create_time&desc=false' \
231
+ --header 'Content-Type: application/json' \
232
+ --header 'Authorization: Bearer {YOUR_ACCESS_TOKEN}'
233
+ ```
234
+
235
+ #### Request parameters
236
+
237
+ - `page`: (*Query parameter*)
238
+ The page number to retrieve from the paginated results.
239
+ - `page_size`: (*Query parameter*)
240
+ The number of records to return per page.
241
+ - `orderby`: (*Query parameter*)
242
+ The field by which the records are sorted.
243
+ - `desc`: (*Query parameter*)
244
+ A boolean flag indicating whether to sort in descending order.
245
+ - `name`: (*Query parameter*)
246
+ The dataset name.
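Building the query string programmatically avoids quoting mistakes with `&` in a shell. The sketch below shows one way to do it with the standard library; the helper name `build_list_datasets_url`, the default values, and the address `http://localhost` are assumptions for illustration, and lowercase `true`/`false` is inferred from the example request rather than from a stated rule.

```python
from urllib.parse import urlencode


def build_list_datasets_url(base_url, page=0, page_size=50,
                            orderby="create_time", desc=False, name=None):
    """Build the list-datasets URL; parameters left as None are omitted."""
    params = {
        "name": name,
        "page": page,
        "page_size": page_size,
        "orderby": orderby,
        # The example request spells booleans in lowercase ("false").
        "desc": str(desc).lower(),
    }
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{base_url}/api/v1/dataset?{query}"


# Hypothetical address, for illustration only.
url = build_list_datasets_url("http://localhost")
```

`urlencode` also percent-encodes values, so a dataset name containing spaces or `&` stays intact in the URL.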
247
+
248
+ ### Response
249
+
250
+ The successful response includes a JSON object like the following:
251
+
252
+ ```shell
253
+ {
254
+ "code": 0,
255
+ "data": [
256
+ {
257
+ "avatar": "",
258
+ "chunk_count": 0,
259
+ "create_date": "Thu, 29 Aug 2024 03:13:07 GMT",
260
+ "create_time": 1724901187843,
261
+ "created_by": "4fb0cd625f9311efba4a0242ac120006",
262
+ "description": "",
263
+ "document_count": 0,
264
+ "embedding_model": "BAAI/bge-large-zh-v1.5",
265
+ "id": "9d3d906665b411ef87d10242ac120006",
266
+ "language": "English",
267
+ "name": "Test",
268
+ "parser_config": {
269
+ "chunk_token_count": 128,
270
+ "delimiter": "\n!?。;!?",
271
+ "layout_recognize": true,
272
+ "task_page_size": 12
273
+ },
274
+ "parse_method": "naive",
275
+ "permission": "me",
276
+ "similarity_threshold": 0.2,
277
+ "status": "1",
278
+ "tenant_id": "4fb0cd625f9311efba4a0242ac120006",
279
+ "token_count": 0,
280
+ "update_date": "Thu, 29 Aug 2024 03:13:07 GMT",
281
+ "update_time": 1724901187843,
282
+ "vector_similarity_weight": 0.3
283
+ }
284
+ ]
285
+ }
286
+ ```
287
+
288
+
289
+ The error response includes a JSON object like the following:
290
+
291
+ ```shell
292
+ {
293
+ "code": 3016,
294
+ "message": "Can't access database to get the dataset list."
295
+ }
296
+ ```
297
+
298
+ ## Upload files to a dataset
299
+
300
+ **POST** `/api/v1/dataset/{dataset_id}/document`
301
+
302
+ Uploads files to a dataset.
303
+
304
+ ### Request
305
+
306
+ - Method: POST
307
+ - URL: `/api/v1/dataset/{dataset_id}/document`
308
+ - Headers:
309
+ - `Content-Type: multipart/form-data`
310
+ - `Authorization: Bearer {YOUR_ACCESS_TOKEN}`
311
+ - Form:
312
+ - `file=@{FILE_PATH}`
313
+
314
+ #### Request example
315
+
316
+ ```shell
317
+ curl --request POST \
318
+ --url http://{address}/api/v1/dataset/{dataset_id}/document \
319
+ --header 'Content-Type: multipart/form-data' \
320
+ --header 'Authorization: Bearer {YOUR_ACCESS_TOKEN}' \
321
+ --form 'file=@{FILE_PATH}'
322
+ ```
323
+
324
+ #### Request parameters
325
+
326
+ - `"dataset_id"`: (*Path parameter*)
327
+ The ID of the dataset.
328
+ - `"file"`: (*Body parameter*)
329
+ The file to upload
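For clients that cannot rely on curl's `--form` option, the multipart body can be encoded by hand. The sketch below encodes a single file field following the general `multipart/form-data` layout; it is an illustration under assumptions, not the server's required format: `encode_multipart_file` is a hypothetical helper, `test.txt` is a made-up file name, the part is labelled `application/octet-stream`, and file names containing double quotes are not escaped.

```python
import uuid


def encode_multipart_file(field_name, filename, content):
    """Encode one file as a multipart/form-data body; returns (content_type, body)."""
    # A random boundary makes a collision with the file content very unlikely.
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n"
        "\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return f"multipart/form-data; boundary={boundary}", head + content + tail


content_type, body = encode_multipart_file("file", "test.txt", b"hello")
```

The returned `content_type` value, including its boundary, would be sent as the `Content-Type` header alongside the `Authorization` header.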
330
+
331
+ ### Response
332
+
333
+ The successful response includes a JSON object like the following:
334
+
335
+ ```shell
336
+ {
337
+ "code": 0
338
+ }
339
+ ```
340
+
341
+ - `"code"`: `integer`
342
+ `0`: The operation succeeds.
343
+
344
+
345
+ The error response includes a JSON object like the following:
346
+
347
+ ```shell
348
+ {
349
+ "code": 3016,
350
+ "message": "Can't connect database"
351
+ }
352
+ ```
353
+
354
+ ## Download a file from a dataset
355
+
356
+ **GET** `/api/v1/dataset/{dataset_id}/document/{document_id}`
357
+
358
+ Downloads a file from a dataset.
359
+
360
+ ### Request
361
+
362
+ - Method: GET
363
+ - URL: `/api/v1/dataset/{dataset_id}/document/{document_id}`
364
+ - Headers:
365
+ - `Content-Type: application/json`
366
+ - `Authorization: Bearer {YOUR_ACCESS_TOKEN}`
367
+ - Output:
368
+ - `{FILE_NAME}`
369
+ #### Request example
370
+
371
+ ```shell
372
+ curl --request GET \
373
+ --url http://{address}/api/v1/dataset/{dataset_id}/document/{document_id} \
374
+ --header 'Content-Type: application/json' \
375
+ --header 'Authorization: Bearer {YOUR_ACCESS_TOKEN}' \
376
+ --output '{FILE_NAME}'
377
+ ```
378
+
379
+ #### Request parameters
380
+
381
+ - `"dataset_id"`: (*Path parameter*)
382
+ The ID of the dataset.
383
+ - `"document_id"`: (*Path parameter*)
384
+ The ID of the document (file) to download.
385
+
386
+ ### Response
387
+
388
+ The successful response includes a JSON object like the following:
389
+
390
+ ```shell
391
+ {
392
+ "code": 0
393
+ }
394
+ ```
395
+
396
+ - `"code"`: `integer`
397
+ `0`: The operation succeeds.
398
+
399
+
400
+ The error response includes a JSON object like the following:
401
+
402
+ ```shell
403
+ {
404
+ "code": 3016,
405
+ "message": "Can't connect database"
406
+ }
407
+ ```
408
+
409
+
410
+ ## List files of a dataset
411
+
412
+ **GET** `/api/v1/dataset/{dataset_id}/info?keywords={keyword}&page={page}&page_size={limit}&orderby={orderby}&desc={desc}&name={name}`
413
+
414
+ Lists the files in a dataset.
415
+
416
+ ### Request
417
+
418
+ - Method: GET
419
+ - URL: `/api/v1/dataset/{dataset_id}/info?keywords={keyword}&page={page}&page_size={limit}&orderby={orderby}&desc={desc}&name={name}`
420
+ - Headers:
421
+ - `Content-Type: application/json`
422
+ - `Authorization: Bearer {YOUR_ACCESS_TOKEN}`
423
+
424
+ #### Request example
425
+
426
+ ```shell
427
+ curl --request GET \
428
+ --url 'http://{address}/api/v1/dataset/{dataset_id}/info?keywords=rag&page=0&page_size=10&orderby=create_time&desc=yes' \
429
+ --header 'Content-Type: application/json' \
430
+ --header 'Authorization: Bearer {YOUR_ACCESS_TOKEN}'
431
+ ```
432
+
433
+ #### Request parameters
434
+
435
+ - `"dataset_id"`: (*Path parameter*)
436
+ The ID of the dataset.
437
+ - `keywords`: (*Filter parameter*)
438
+ Keywords used to search for matching files.
439
+ - `page`: (*Filter parameter*)
440
+ The current page number to retrieve from the paginated data. This parameter determines which set of records will be fetched.
441
+ - `page_size`: (*Filter parameter*)
442
+ The number of records to retrieve per page. This controls how many records will be included in each page.
443
+ - `orderby`: (*Filter parameter*)
444
+ The field by which the records should be sorted. This specifies the attribute or column used to order the results.
445
+ - `desc`: (*Filter parameter*)
446
+ A boolean flag indicating whether the sorting should be in descending order.
447
+ - `name`: (*Filter parameter*)
448
+ File name.
449
+
450
+ ### Response
451
+
452
+ The successful response includes a JSON object like the following:
453
+
454
+ ```shell
455
+ {
456
+ "code": 0,
457
+ "data": {
458
+ "docs": [
459
+ {
460
+ "chunk_count": 0,
461
+ "create_date": "Wed, 18 Sep 2024 08:20:49 GMT",
462
+ "create_time": 1726647649379,
463
+ "created_by": "134408906b6811efbcd20242ac120005",
464
+ "id": "e970a94a759611efae5b0242ac120004",
465
+ "knowledgebase_id": "e95f574e759611efbc850242ac120004",
466
+ "location": "Test Document222.txt",
467
+ "name": "Test Document222.txt",
468
+ "parser_config": {
469
+ "chunk_token_count": 128,
470
+ "delimiter": "\n!?。;!?",
471
+ "layout_recognize": true,
472
+ "task_page_size": 12
473
+ },
474
+ "parser_method": "naive",
475
+ "process_begin_at": null,
476
+ "process_duation": 0.0,
477
+ "progress": 0.0,
478
+ "progress_msg": "",
479
+ "run": "0",
480
+ "size": 46,
481
+ "source_type": "local",
482
+ "status": "1",
483
+ "thumbnail": null,
484
+ "token_count": 0,
485
+ "type": "doc",
486
+ "update_date": "Wed, 18 Sep 2024 08:20:49 GMT",
487
+ "update_time": 1726647649379
488
+ },
489
+ {
490
+ "chunk_count": 0,
491
+ "create_date": "Wed, 18 Sep 2024 08:20:49 GMT",
492
+ "create_time": 1726647649340,
493
+ "created_by": "134408906b6811efbcd20242ac120005",
494
+ "id": "e96aad9c759611ef9ab60242ac120004",
495
+ "knowledgebase_id": "e95f574e759611efbc850242ac120004",
496
+ "location": "Test Document111.txt",
497
+ "name": "Test Document111.txt",
498
+ "parser_config": {
499
+ "chunk_token_count": 128,
500
+ "delimiter": "\n!?。;!?",
501
+ "layout_recognize": true,
502
+ "task_page_size": 12
503
+ },
504
+ "parser_method": "naive",
505
+ "process_begin_at": null,
506
+ "process_duation": 0.0,
507
+ "progress": 0.0,
508
+ "progress_msg": "",
509
+ "run": "0",
510
+ "size": 46,
511
+ "source_type": "local",
512
+ "status": "1",
513
+ "thumbnail": null,
514
+ "token_count": 0,
515
+ "type": "doc",
516
+ "update_date": "Wed, 18 Sep 2024 08:20:49 GMT",
517
+ "update_time": 1726647649340
518
+ }
519
+ ],
520
+ "total": 2
521
+ }
522
+ }
523
+ ```
524
+
525
+ - `"code"`: `integer`
526
+ `0`: The operation succeeds.
527
+
528
+
529
+ The error response includes a JSON object like the following:
530
+
531
+ ```shell
532
+ {
533
+ "code": 3016,
534
+ "message": "Can't connect database"
535
+ }
536
+ ```
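Since the success response reports `"total"` alongside one page of `"docs"`, a client can keep requesting pages until it has seen every file. The sketch below illustrates that loop without any network access; it assumes zero-based page numbers (as in the example request), and `iter_documents`, `fetch_page`, and `fake_fetch` are hypothetical names introduced here.

```python
def iter_documents(fetch_page, page_size=10):
    """Yield every document, requesting pages until "total" documents were seen.

    fetch_page(page, page_size) is a caller-supplied (hypothetical) function
    that returns the decoded "data" object of one response.
    """
    page, seen = 0, 0  # Assumes zero-based pages, as in the example request.
    while True:
        data = fetch_page(page, page_size)
        docs = data.get("docs", [])
        yield from docs
        seen += len(docs)
        # Stop on an empty page too, so a wrong "total" cannot loop forever.
        if not docs or seen >= data.get("total", 0):
            return
        page += 1


# Stand-in for real HTTP calls: three documents served two per page.
_docs = [{"name": f"doc{i}.txt"} for i in range(3)]


def fake_fetch(page, page_size):
    start = page * page_size
    return {"docs": _docs[start:start + page_size], "total": len(_docs)}


names = [doc["name"] for doc in iter_documents(fake_fetch, page_size=2)]
```

In real use, `fetch_page` would issue the GET request shown above and return the `"data"` object of the decoded response.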
537
+
538
+ ## Update file information in a dataset
539
+
540
+ **PUT** `/api/v1/dataset/{dataset_id}/info/{document_id}`
541
+
542
+ Updates the information of a file in a dataset.
543
+
544
+ ### Request
545
+
546
+ - Method: PUT
547
+ - URL: `/api/v1/dataset/{dataset_id}/info/{document_id}`
548
+ - Headers:
549
+ - `Content-Type: application/json`
550
+ - `Authorization: Bearer {YOUR_ACCESS_TOKEN}`
551
+
552
+ #### Request example
553
+
554
+ ```shell
555
+ curl --request PUT \
556
+ --url http://{address}/api/v1/dataset/{dataset_id}/info/{document_id} \
557
+ --header 'Content-Type: application/json' \
558
+ --header 'Authorization: Bearer {YOUR_ACCESS_TOKEN}' \
559
+ --data-raw '{
560
+ "document_id": "f6b170ac758811efa0660242ac120004",
561
+ "document_name": "manual.txt",
562
+ "thumbnail": null,
563
+ "knowledgebase_id": "779333c0758611ef910f0242ac120004",
564
+ "parser_method": "manual",
565
+ "parser_config": {"chunk_token_count": 128, "delimiter": "\n!?。;!?", "layout_recognize": true, "task_page_size": 12},
566
+ "source_type": "local", "type": "doc",
567
+ "created_by": "134408906b6811efbcd20242ac120005",
568
+ "size": 0, "token_count": 0, "chunk_count": 0,
569
+ "progress": 0.0,
570
+ "progress_msg": "",
571
+ "process_begin_at": null,
572
+ "process_duration": 0.0
573
+ }'
574
+ ```
575
+
576
+ #### Request parameters
577
+
578
+ - `"document_id"`: (*Body parameter*)
579
+ - `"document_name"`: (*Body parameter*)
580
+
581
+ ### Response
582
+
583
+ The successful response includes a JSON object like the following:
584
+
585
+ ```shell
586
+ {
587
+ "code": 0
588
+ }
589
+ ```
590
+
591
+ The error response includes a JSON object like the following:
592
+
593
+ ```shell
594
+ {
595
+ "code": 3016,
596
+ "message": "Can't connect database"
597
+ }
598
+ ```
599
+
600
+ ## Parse files in dataset
601
+
602
+ **POST** `/api/v1/dataset/{dataset_id}/chunk`
603
+
604
+ Parse files into chunks in a dataset
605
+
606
+ ### Request
607
+
608
+ - Method: POST
609
+ - URL: `/api/v1/dataset/{dataset_id}/chunk`
610
+ - Headers:
611
+ - `Content-Type: application/json`
612
+ - `Authorization: Bearer {YOUR_ACCESS_TOKEN}`
613
+
614
+ #### Request example
615
+
616
+ ```shell
617
+ curl --request POST \
618
+ --url http://{address}/api/v1/dataset/{dataset_id}/chunk \
619
+ --header 'Content-Type: application/json' \
620
+ --header 'Authorization: Bearer {YOUR_ACCESS_TOKEN}' \
621
+ --data-raw '{
622
+ "documents": ["f6b170ac758811efa0660242ac120004", "97ad64b6759811ef9fc30242ac120004"]
623
+ }'
624
+ ```
625
+
626
+ #### Request parameters
627
+
628
+ - `"dataset_id"`: (*Path parameter*)
629
+ - `"documents"`: (*Body parameter*)
630
+ - Documents to parse
631
+
632
+ ### Response
633
+
634
+ The successful response includes a JSON object like the following:
635
+
636
+ ```shell
637
+ {
638
+ "code": 0
639
+ }
640
+ ```
641
+
642
+ The error response includes a JSON object like the following:
643
+
644
+ ```shell
645
+ {
646
+ "code": 3016,
647
+ "message": "Can't connect database"
648
+ }
649
+ ```
650
+
651
+ ## Stop file parsing
652
+
653
+ **DELETE** `/api/v1/dataset/{dataset_id}/chunk`
654
+
655
+ Stop file parsing
656
+
657
+ ### Request
658
+
659
+ - Method: DELETE
660
+ - URL: `/api/v1/dataset/{dataset_id}/chunk`
661
+ - Headers:
662
+ - `Content-Type: application/json`
663
+ - `Authorization: Bearer {YOUR_ACCESS_TOKEN}`
664
+
665
+ #### Request example
666
+
667
+ ```shell
668
+ curl --request DELETE \
669
+ --url http://{address}/api/v1/dataset/{dataset_id}/chunk \
670
+ --header 'Content-Type: application/json' \
671
+ --header 'Authorization: Bearer {YOUR_ACCESS_TOKEN}' \
672
+ --data-raw '{
673
+ "documents": ["f6b170ac758811efa0660242ac120004", "97ad64b6759811ef9fc30242ac120004"]
674
+ }'
675
+ ```
676
+
677
+ #### Request parameters
678
+
679
+ - `"dataset_id"`: (*Path parameter*)
680
+ - `"documents"`: (*Body parameter*)
681
+ - Documents to stop parsing
682
+
683
+ ### Response
684
+
685
+ The successful response includes a JSON object like the following:
686
+
687
+ ```shell
688
+ {
689
+ "code": 0
690
+ }
691
+ ```
692
+
693
+ The error response includes a JSON object like the following:
694
+
695
+ ```shell
696
+ {
697
+ "code": 3016,
698
+ "message": "Can't connect database"
699
+ }
700
+ ```
701
+
702
+ ## Get document chunk list
703
+
704
+ **GET** `/api/v1/dataset/{dataset_id}/document/{document_id}/chunk`
705
+
706
+ Get document chunk list
707
+
708
+ ### Request
709
+
710
+ - Method: GET
711
+ - URL: `/api/v1/dataset/{dataset_id}/document/{document_id}/chunk`
712
+ - Headers:
713
+ - `Content-Type: application/json`
714
+ - `Authorization: Bearer {YOUR_ACCESS_TOKEN}`
715
+
716
+ #### Request example
717
+
718
+ ```shell
719
+ curl --request GET \
720
+ --url http://{address}/api/v1/dataset/{dataset_id}/document/{document_id}/chunk \
721
+ --header 'Content-Type: application/json' \
722
+ --header 'Authorization: Bearer {YOUR_ACCESS_TOKEN}'
723
+ ```
724
+
725
+ #### Request parameters
726
+
727
+ - `"dataset_id"`: (*Path parameter*)
728
+ - `"document_id"`: (*Path parameter*)
729
+
730
+ ### Response
731
+
732
+ The successful response includes a JSON object like the following:
733
+
734
+ ```shell
735
+ {
736
+ "code": 0,
737
+ "data": {
738
+ "chunks": [
739
+ {
740
+ "available_int": 1,
741
+ "content": "<em>advantag</em>of ragflow increas accuraci and relev:by incorpor retriev inform , ragflow can gener respons that are more accur",
742
+ "document_keyword": "ragflow_test.txt",
743
+ "document_id": "77df9ef4759a11ef8bdd0242ac120004",
744
+ "id": "4ab8c77cfac1a829c8d5ed022a0808c0",
745
+ "image_id": "",
746
+ "important_keywords": [],
747
+ "positions": [
748
+ ""
749
+ ]
750
+ }
751
+ ],
752
+ "doc": {
753
+ "chunk_count": 5,
754
+ "create_date": "Wed, 18 Sep 2024 08:46:16 GMT",
755
+ "create_time": 1726649176833,
756
+ "created_by": "134408906b6811efbcd20242ac120005",
757
+ "id": "77df9ef4759a11ef8bdd0242ac120004",
758
+ "knowledgebase_id": "77d9d24e759a11ef880c0242ac120004",
759
+ "location": "ragflow_test.txt",
760
+ "name": "ragflow_test.txt",
761
+ "parser_config": {
762
+ "chunk_token_count": 128,
763
+ "delimiter": "\n!?。;!?",
764
+ "layout_recognize": true,
765
+ "task_page_size": 12
766
+ },
767
+ "parser_method": "naive",
768
+ "process_begin_at": "Wed, 18 Sep 2024 08:46:16 GMT",
769
+ "process_duation": 7.3213,
770
+ "progress": 1.0,
771
+ "progress_msg": "\nTask has been received.\nStart to parse.\nFinish parsing.\nFinished slicing files(5). Start to embedding the content.\nFinished embedding(6.16)! Start to build index!\nDone!",
772
+ "run": "3",
773
+ "size": 4209,
774
+ "source_type": "local",
775
+ "status": "1",
776
+ "thumbnail": null,
777
+ "token_count": 746,
778
+ "type": "doc",
779
+ "update_date": "Wed, 18 Sep 2024 08:46:23 GMT",
780
+ "update_time": 1726649183321
781
+ },
782
+ "total": 1
783
+ }
784
+ }
785
+ ```
786
+
787
+ The error response includes a JSON object like the following:
788
+
789
+ ```shell
790
+ {
791
+ "code": 3016,
792
+ "message": "Can't connect database"
793
+ }
794
+ ```
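The `"content"` field in the example success response contains `<em>` highlight tags. If plain text is needed, a client can strip them after decoding the response. The sketch below assumes the `"data"` object has the shape shown above; `chunk_texts` is a hypothetical helper, and the sample is an abbreviated, made-up excerpt of that response.

```python
import re

# Matches the <em> and </em> tags seen in the example "content" field.
_EM_TAG = re.compile(r"</?em>")


def chunk_texts(data):
    """Return the "content" of each chunk with <em> highlight tags removed."""
    return [_EM_TAG.sub("", chunk["content"]) for chunk in data.get("chunks", [])]


# Abbreviated sample modelled on the "data" object of the success response.
sample = {
    "chunks": [{"content": "<em>advantag</em>of ragflow", "id": "4ab8"}],
    "total": 1,
}
texts = chunk_texts(sample)
```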
795
+
796
+ ## Delete document chunks
797
+
798
+ **DELETE** `/api/v1/dataset/{dataset_id}/document/{document_id}/chunk`
799
+
800
+ Delete document chunks
801
+
802
+ ### Request
803
+
804
+ - Method: DELETE
805
+ - URL: `/api/v1/dataset/{dataset_id}/document/{document_id}/chunk`
806
+ - Headers:
807
+ - `Content-Type: application/json`
808
+ - `Authorization: Bearer {YOUR_ACCESS_TOKEN}`
809
+
810
+ #### Request example
811
+
812
+ ```shell
813
+ curl --request DELETE \
814
+ --url http://{address}/api/v1/dataset/{dataset_id}/document/{document_id}/chunk \
815
+ --header 'Content-Type: application/json' \
816
+ --header 'Authorization: Bearer {YOUR_ACCESS_TOKEN}' \
817
+ --data-raw '{
818
+ "chunks": ["f6b170ac758811efa0660242ac120004", "97ad64b6759811ef9fc30242ac120004"]
819
+ }'
820
+ ```
821
+
822
+ ## Update document chunk
823
+
824
+ **PUT** `/api/v1/dataset/{dataset_id}/document/{document_id}/chunk`
825
+
826
+ Update document chunk
827
+
828
+ ### Request
829
+
830
+ - Method: PUT
831
+ - URL: `/api/v1/dataset/{dataset_id}/document/{document_id}/chunk`
832
+ - Headers:
833
+ - `Content-Type: application/json`
834
+ - `Authorization: Bearer {YOUR_ACCESS_TOKEN}`
835
+
836
+ #### Request example
837
+
838
+ ```shell
839
+ curl --request PUT \
840
+ --url http://{address}/api/v1/dataset/{dataset_id}/document/{document_id}/chunk \
841
+ --header 'Content-Type: application/json' \
842
+ --header 'Authorization: Bearer {YOUR_ACCESS_TOKEN}' \
843
+ --data-raw '{
844
+ "chunk_id": "d87fb0b7212c15c18d0831677552d7de",
845
+ "knowledgebase_id": null,
846
+ "name": "",
847
+ "content": "ragflow123",
848
+ "important_keywords": [],
849
+ "document_id": "e6bbba92759511efaa900242ac120004",
850
+ "status": "1"
851
+ }'
852
+ ```
853
+
854
+ ## Insert document chunks
855
+
856
+ **POST** `/api/v1/dataset/{dataset_id}/document/{document_id}/chunk`
857
+
858
+ Insert document chunks
859
+
860
+ ### Request
861
+
862
+ - Method: POST
863
+ - URL: `/api/v1/dataset/{dataset_id}/document/{document_id}/chunk`
864
+ - Headers:
865
+ - `Content-Type: application/json`
866
+ - `Authorization: Bearer {YOUR_ACCESS_TOKEN}`
867
+
868
+ #### Request example
869
+
870
+ ```shell
871
+ curl --request POST \
872
+ --url http://{address}/api/v1/dataset/{dataset_id}/document/{document_id}/chunk \
873
+ --header 'Content-Type: application/json' \
874
+ --header 'Authorization: Bearer {YOUR_ACCESS_TOKEN}' \
875
+ --data-raw '{
876
+ "document_id": "97ad64b6759811ef9fc30242ac120004",
877
+ "content": ["ragflow content", "ragflow content"]
878
+ }'
879
+ ```
880
+
881
+ ## Dataset retrieval test
882
+
883
+ **GET** `/api/v1/dataset/{dataset_id}/retrieval`
884
+
885
+ Retrieval test of a dataset
886
+
887
+ ### Request
888
+
889
+ - Method: GET
890
+ - URL: `/api/v1/dataset/{dataset_id}/retrieval`
891
+ - Headers:
892
+ - `Content-Type: application/json`
893
+ - `Authorization: Bearer {YOUR_ACCESS_TOKEN}`
894
+
895
+ #### Request example
896
+
897
+ ```shell
898
+ curl --request GET \
899
+ --url http://{address}/api/v1/dataset/{dataset_id}/retrieval \
900
+ --header 'Content-Type: application/json' \
901
+ --header 'Authorization: Bearer {YOUR_ACCESS_TOKEN}' \
902
+ --data-raw '{
903
+ "query_text": "This is a cat."
904
+ }'
905
+ ```
906
+
907
+ ## Create chat
908
+
909
+ **POST** `/api/v1/chat`
910
+
911
+ Create a chat
912
+
913
+ ### Request
914
+
915
+ - Method: POST
916
+ - URL: `/api/v1/chat`
917
+ - Headers:
918
+ - `Content-Type: application/json`
919
+ - `Authorization: Bearer {YOUR_ACCESS_TOKEN}`
920
+
921
+ #### Request example
922
+
923
+ ```shell
924
+ curl --request POST \
925
+ --url http://{address}/api/v1/chat \
926
+ --header 'Content-Type: application/json' \
927
+ --header 'Authorization: Bearer {YOUR_ACCESS_TOKEN}' \
928
+ --data-binary '{
929
+ "avatar": "path",
930
+ "create_date": "Wed, 04 Sep 2024 10:08:01 GMT",
931
+ "create_time": 1725444481128,
932
+ "description": "A helpful Assistant",
933
+ "do_refer": "",
934
+ "knowledgebases": [
935
+ {
936
+ "avatar": null,
937
+ "chunk_count": 0,
938
+ "description": null,
939
+ "document_count": 0,
940
+ "embedding_model": "",
941
+ "id": "d6d0e8e868cd11ef92250242ac120006",
942
+ "language": "English",
943
+ "name": "Test_assistant",
944
+ "parse_method": "naive",
945
+ "parser_config": {
946
+ "pages": [
947
+ [
948
+ 1,
949
+ 1000000
950
+ ]
951
+ ]
952
+ },
953
+ "permission": "me",
954
+ "tenant_id": "4fb0cd625f9311efba4a0242ac120006"
955
+ }
956
+ ],
957
+ "language": "English",
958
+ "llm": {
959
+ "frequency_penalty": 0.7,
960
+ "max_tokens": 512,
961
+ "model_name": "deepseek-chat",
962
+ "presence_penalty": 0.4,
963
+ "temperature": 0.1,
964
+ "top_p": 0.3
965
+ },
966
+ "name": "Miss R",
967
+ "prompt": {
968
+ "empty_response": "Sorry! Can't find the context!",
969
+ "keywords_similarity_weight": 0.7,
970
+ "opener": "Hi! I am your assistant, what can I do for you?",
971
+ "prompt": "You are an intelligent assistant. Please summarize the content of the knowledge base to answer the question. Please list the data in the knowledge base and answer in detail. When all knowledge base content is irrelevant to the question, your answer must include the sentence 'The answer you are looking for is not found in the knowledge base!' Answers need to consider chat history.\nHere is the knowledge base:\n{knowledge}\nThe above is the knowledge base.",
972
+ "rerank_model": "",
973
+ "show_quote": true,
974
+ "similarity_threshold": 0.2,
975
+ "top_n": 8,
976
+ "variables": [
977
+ {
978
+ "key": "knowledge",
979
+ "optional": true
980
+ }
981
+ ]
982
+ },
983
+ "prompt_type": "simple",
984
+ "status": "1",
985
+ "top_k": 1024,
986
+ "update_date": "Wed, 04 Sep 2024 10:08:01 GMT",
987
+ "update_time": 1725444481128
988
+ }'
989
+ ```
990
+
991
+ ## Update chat
992
+
993
+ **PUT** `/api/v1/chat`
994
+
995
+ Update a chat
996
+
997
+ ### Request
998
+
999
+ - Method: PUT
1000
+ - URL: `/api/v1/chat`
1001
+ - Headers:
1002
+ - `Content-Type: application/json`
1003
+ - `Authorization: Bearer {YOUR_ACCESS_TOKEN}`
1004
+
1005
+ #### Request example
1006
+
1007
+ curl --request PUT \
1008
+ --url http://{address}/api/v1/chat \
1009
+ --header 'Content-Type: application/json' \
1010
+ --header 'Authorization: Bearer {YOUR_ACCESS_TOKEN}' \
1011
+ --data-binary '{
1012
+ "id":"554e96746aaa11efb06b0242ac120005",
1013
+ "name":"Test"
1014
+ }'
1015
+
1016
+ ## Delete chat
1017
+
1018
+ **DELETE** `/api/v1/chat/{chat_id}`
1019
+
1020
+ Delete a chat
1021
+
1022
+ ### Request
1023
+
1024
+ - Method: DELETE
1025
+ - URL: `/api/v1/chat/{chat_id}`
1026
+ - Headers:
1027
+ - `Content-Type: application/json`
1028
+ - `Authorization: Bearer {YOUR_ACCESS_TOKEN}`
1029
+
1030
+ #### Request example
1031
+
1032
+ curl --request DELETE \
1033
+ --url http://{address}/api/v1/chat/554e96746aaa11efb06b0242ac120005 \
1034
+ --header 'Content-Type: application/json' \
1035
+ --header 'Authorization: Bearer {YOUR_ACCESS_TOKEN}'
1037
+
1038
+ ## List chats
1039
+
1040
+ **GET** `/api/v1/chat`
1041
+
1042
+ List all chat assistants
1043
+
1044
+ ### Request
1045
+
1046
+ - Method: GET
1047
+ - URL: `/api/v1/chat`
1048
+ - Headers:
1049
+ - `Content-Type: application/json`
1050
+ - `Authorization: Bearer {YOUR_ACCESS_TOKEN}`
1051
+
1052
+ #### Request example
1053
+
1054
+ curl --request GET \
1055
+ --url http://{address}/api/v1/chat \
1056
+ --header 'Content-Type: application/json' \
1057
+ --header 'Authorization: Bearer {YOUR_ACCESS_TOKEN}'
1058
+
1059
+ ## Create a chat session
1060
+
1061
+ **POST** `/api/v1/chat/{chat_id}/session`
1062
+
1063
+ Create a chat session
1064
+
1065
+ ### Request
1066
+
1067
+ - Method: POST
1068
+ - URL: `/api/v1/chat/{chat_id}/session`
1069
+ - Headers:
1070
+ - `Content-Type: application/json`
1071
+ - `Authorization: Bearer {YOUR_ACCESS_TOKEN}`
1072
+
1073
+ #### Request example
1074
+ curl --request POST \
1075
+ --url http://{address}/api/v1/chat/{chat_id}/session \
1076
+ --header 'Content-Type: application/json' \
1077
+ --header 'Authorization: Bearer {YOUR_ACCESS_TOKEN}' \
1078
+ --data-binary '{
1079
+ "name": "new session"
1080
+ }'
1081
+
1082
+ ## List the sessions of a chat
1083
+
1084
+ **GET** `/api/v1/chat/{chat_id}/session`
1085
+
1086
+ Lists all sessions of a chat.
1087
+
1088
+ ### Request
1089
+
1090
+ - Method: GET
1091
+ - URL: `/api/v1/chat/{chat_id}/session`
1092
+ - Headers:
1093
+ - `Content-Type: application/json`
1094
+ - `Authorization: Bearer {YOUR_ACCESS_TOKEN}`
1095
+
1096
+ #### Request example
1097
+ curl --request GET \
1098
+ --url http://{address}/api/v1/chat/554e96746aaa11efb06b0242ac120005/session \
1099
+ --header 'Content-Type: application/json' \
1100
+ --header 'Authorization: Bearer {YOUR_ACCESS_TOKEN}'
1101
+
1102
+ ## Delete a chat session
1103
+
1104
+ **DELETE** `/api/v1/chat/{chat_id}/session/{session_id}`
1105
+
1106
+ Delete a chat session
1107
+
1108
+ ### Request
1109
+
1110
+ - Method: DELETE
1111
+ - URL: `/api/v1/chat/{chat_id}/session/{session_id}`
1112
+ - Headers:
1113
+ - `Content-Type: application/json`
1114
+ - `Authorization: Bearer {YOUR_ACCESS_TOKEN}`
1115
+
1116
+ #### Request example
1117
+ curl --request DELETE \
1118
+ --url http://{address}/api/v1/chat/554e96746aaa11efb06b0242ac120005/session/791aed9670ea11efbb7e0242ac120007 \
1119
+ --header 'Content-Type: application/json' \
1120
+ --header 'Authorization: Bearer {YOUR_ACCESS_TOKEN}'
1121
+
1122
+ ## Update a chat session
1123
+
1124
+ **PUT** `/api/v1/chat/{chat_id}/session/{session_id}`
1125
+
1126
+ Update a chat session
1127
+
1128
+ ### Request
1129
+
1130
+ - Method: PUT
1131
+ - URL: `/api/v1/chat/{chat_id}/session/{session_id}`
1132
+ - Headers:
1133
+ - `Content-Type: application/json`
1134
+ - `Authorization: Bearer {YOUR_ACCESS_TOKEN}`
1135
+
1136
+ #### Request example
1137
+ curl --request PUT \
1138
+ --url http://{address}/api/v1/chat/554e96746aaa11efb06b0242ac120005/session/791aed9670ea11efbb7e0242ac120007 \
1139
+ --header 'Content-Type: application/json' \
1140
+ --header 'Authorization: Bearer {YOUR_ACCESS_TOKEN}' \
1141
+ --data-binary '{
1142
+ "name": "Updated session"
1143
+ }'
1144
+
1145
+ ## Chat with a chat session
1146
+
1147
+ **POST** `/api/v1/chat/{chat_id}/session/{session_id}/completion`
1148
+
1149
+ Chat with a chat session
1150
+
1151
+ ### Request
1152
+
1153
+ - Method: POST
1154
+ - URL: `/api/v1/chat/{chat_id}/session/{session_id}/completion`
1155
+ - Headers:
1156
+ - `Content-Type: application/json`
1157
+ - `Authorization: Bearer {YOUR_ACCESS_TOKEN}`
1158
+
1159
+ #### Request example
1160
+ curl --request POST \
1161
+ --url http://{address}/api/v1/chat/554e96746aaa11efb06b0242ac120005/session/791aed9670ea11efbb7e0242ac120007/completion \
1162
+ --header 'Content-Type: application/json' \
1163
+ --header 'Authorization: Bearer {YOUR_ACCESS_TOKEN}' \
1164
+ --data-binary '{
1165
+ "question": "Hello!",
1166
+ "stream": true
1167
+ }'