Update README.md
README.md
CHANGED
|
@@ -2615,7 +2615,6 @@ language:
|
|
| 2615 |
<a href=#usage>Usage</a> |
|
| 2616 |
<a href="#evaluation">Evaluation</a> |
|
| 2617 |
<a href="#train">Train</a> |
|
| 2618 |
-
<a href="#contact">Contact</a> |
|
| 2619 |
<a href="#citation">Citation</a> |
|
| 2620 |
<a href="#license">License</a>
|
| 2621 |
<p>
|
|
@@ -2626,13 +2625,19 @@ More details please refer to our Github: [FlagEmbedding](https://github.com/Flag
|
|
| 2626 |
|
| 2627 |
[English](README.md) | [中文](https://github.com/FlagOpen/FlagEmbedding/blob/master/README_zh.md)
|
| 2628 |
|
| 2629 |
-
FlagEmbedding
|
| 2630 |
-
And it also can be used in vector databases for LLMs.
|
| 2631 |
|
| 2632 |
-
|
| 2633 |
-
-
| 2634 |
- 09/15/2023: The [technical report](https://arxiv.org/pdf/2309.07597.pdf) of BGE has been released
|
| 2635 |
-
- 09/15/2023: The [
|
| 2636 |
- 09/12/2023: New models:
|
| 2637 |
- **New reranker model**: release cross-encoder models `BAAI/bge-reranker-base` and `BAAI/bge-reranker-large`, which are more powerful than embedding model. We recommend to use/fine-tune them to re-rank top-k documents returned by embedding models.
|
| 2638 |
- **update embedding model**: release `bge-*-v1.5` embedding model to alleviate the issue of the similarity distribution, and enhance its retrieval ability without instruction.
|
|
@@ -2657,6 +2662,7 @@ And it also can be used in vector databases for LLMs.
|
|
| 2657 |
|
| 2658 |
| Model | Language | | Description | query instruction for retrieval [1] |
|
| 2659 |
|:-------------------------------|:--------:| :--------:| :--------:|:--------:|
|
|
|
|
| 2660 |
| [BAAI/llm-embedder](https://huggingface.co/BAAI/llm-embedder) | English | [Inference](./FlagEmbedding/llm_embedder/README.md) [Fine-tune](./FlagEmbedding/llm_embedder/README.md) | a unified embedding model to support diverse retrieval augmentation needs for LLMs | See [README](./FlagEmbedding/llm_embedder/README.md) |
|
| 2661 |
| [BAAI/bge-reranker-large](https://huggingface.co/BAAI/bge-reranker-large) | Chinese and English | [Inference](#usage-for-reranker) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/reranker) | a cross-encoder model which is more accurate but less efficient [2] | |
|
| 2662 |
| [BAAI/bge-reranker-base](https://huggingface.co/BAAI/bge-reranker-base) | Chinese and English | [Inference](#usage-for-reranker) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/reranker) | a cross-encoder model which is more accurate but less efficient [2] | |
|
|
@@ -2986,9 +2992,6 @@ The data format is the same as embedding model, so you can fine-tune it easily f
|
|
| 2986 |
More details please refer to [./FlagEmbedding/reranker/README.md](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/reranker)
|
| 2987 |
|
| 2988 |
|
| 2989 |
-
## Contact
|
| 2990 |
-
If you have any question or suggestion related to this project, feel free to open an issue or pull request.
|
| 2991 |
-
You also can email Shitao Xiao([email protected]) and Zheng Liu([email protected]).
|
| 2992 |
|
| 2993 |
|
| 2994 |
## Citation
|
|
|
|
| 2615 |
<a href=#usage>Usage</a> |
|
| 2616 |
<a href="#evaluation">Evaluation</a> |
|
| 2617 |
<a href="#train">Train</a> |
|
|
|
|
| 2618 |
<a href="#citation">Citation</a> |
|
| 2619 |
<a href="#license">License</a>
|
| 2620 |
<p>
|
|
|
|
| 2625 |
|
| 2626 |
[English](README.md) | [中文](https://github.com/FlagOpen/FlagEmbedding/blob/master/README_zh.md)
|
| 2627 |
|
| 2628 |
+
FlagEmbedding focuses on retrieval-augmented LLMs and currently consists of the following projects:
|
|
|
|
| 2629 |
|
| 2630 |
+
- **Fine-tuning of LM**: [LM-Cocktail](https://github.com/FlagOpen/FlagEmbedding/tree/master/LM_Cocktail)
|
| 2631 |
+
- **Dense Retrieval**: [LLM Embedder](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/llm_embedder), [BGE Embedding](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/baai_general_embedding), [C-MTEB](https://github.com/FlagOpen/FlagEmbedding/tree/master/C_MTEB)
|
| 2632 |
+
- **Reranker Model**: [BGE Reranker](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/reranker)
|
| 2633 |
+
|
| 2634 |
+
|
| 2635 |
+
## News
|
| 2636 |
+
|
| 2637 |
+
- 11/23/2023: Release [LM-Cocktail](https://github.com/FlagOpen/FlagEmbedding/tree/master/LM_Cocktail), a method to maintain general capabilities during fine-tuning by merging multiple language models. [Technical Report](https://arxiv.org/abs/2311.13534) :fire:
|
| 2638 |
+
- 10/12/2023: Release [LLM-Embedder](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/llm_embedder), a unified embedding model to support diverse retrieval augmentation needs for LLMs. [Technical Report](https://arxiv.org/pdf/2310.07554.pdf)
|
| 2639 |
- 09/15/2023: The [technical report](https://arxiv.org/pdf/2309.07597.pdf) of BGE has been released
|
| 2640 |
+
- 09/15/2023: The [massive training data](https://data.baai.ac.cn/details/BAAI-MTP) of BGE has been released
|
| 2641 |
- 09/12/2023: New models:
|
| 2642 |
- **New reranker model**: release cross-encoder models `BAAI/bge-reranker-base` and `BAAI/bge-reranker-large`, which are more powerful than embedding model. We recommend to use/fine-tune them to re-rank top-k documents returned by embedding models.
|
| 2643 |
- **update embedding model**: release `bge-*-v1.5` embedding model to alleviate the issue of the similarity distribution, and enhance its retrieval ability without instruction.
|
|
|
|
| 2662 |
|
| 2663 |
| Model | Language | | Description | query instruction for retrieval [1] |
|
| 2664 |
|:-------------------------------|:--------:| :--------:| :--------:|:--------:|
|
| 2665 |
+
| [LM-Cocktail](https://huggingface.co/Shitao) | English | | fine-tuned models (Llama and BGE) which can be used to reproduce the results of LM-Cocktail | |
|
| 2666 |
| [BAAI/llm-embedder](https://huggingface.co/BAAI/llm-embedder) | English | [Inference](./FlagEmbedding/llm_embedder/README.md) [Fine-tune](./FlagEmbedding/llm_embedder/README.md) | a unified embedding model to support diverse retrieval augmentation needs for LLMs | See [README](./FlagEmbedding/llm_embedder/README.md) |
|
| 2667 |
| [BAAI/bge-reranker-large](https://huggingface.co/BAAI/bge-reranker-large) | Chinese and English | [Inference](#usage-for-reranker) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/reranker) | a cross-encoder model which is more accurate but less efficient [2] | |
|
| 2668 |
| [BAAI/bge-reranker-base](https://huggingface.co/BAAI/bge-reranker-base) | Chinese and English | [Inference](#usage-for-reranker) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/reranker) | a cross-encoder model which is more accurate but less efficient [2] | |
|
|
|
|
| 2992 |
More details please refer to [./FlagEmbedding/reranker/README.md](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/reranker)
|
| 2993 |
|
| 2994 |
|
| 2995 |
|
| 2996 |
|
| 2997 |
## Citation
|