Add Text Embeddings Inference (TEI) tag & snippet
This PR adds the `text-embeddings-inference` tag to the `README.md` metadata, both to let the community know that `Alibaba-NLP/gte-multilingual-base` can be deployed with Text Embeddings Inference (TEI) and to improve discoverability within the Hub. Additionally, this PR includes a snippet in the "Usage" section of the `README.md` showing how to deploy `Alibaba-NLP/gte-multilingual-base` and send a request to the OpenAI-compatible `/v1/embeddings` endpoint.
Note that before TEI 1.6.1, deploying `Alibaba-NLP/gte-multilingual-base` with TEI required passing `--revision refs/pr/7`, as described in https://huggingface.co/Alibaba-NLP/gte-multilingual-base/discussions/7. This is no longer required as of TEI 1.6.1, since the model is handled natively within TEI per https://github.com/huggingface/text-embeddings-inference/pull/538.
cc @thenlper for review and @tomaarsen for visibility
````diff
@@ -5,6 +5,7 @@ tags:
 - transformers
 - multilingual
 - sentence-similarity
+- text-embeddings-inference
 license: apache-2.0
 language:
 - af
@@ -4725,6 +4726,51 @@ michaelf34/infinity:0.0.69 \
 v2 --model-id Alibaba-NLP/gte-multilingual-base --revision "main" --dtype float16 --batch-size 32 --device cuda --engine torch --port 7997
 ```
 
+### Use with Text Embeddings Inference (TEI)
+
+Usage via Docker and [Text Embeddings Inference (TEI)](https://github.com/huggingface/text-embeddings-inference):
+
+- CPU:
+
+```bash
+docker run --platform linux/amd64 \
+    -p 8080:80 \
+    -v $PWD/data:/data \
+    --pull always \
+    ghcr.io/huggingface/text-embeddings-inference:cpu-1.7 \
+    --model-id Alibaba-NLP/gte-multilingual-base \
+    --dtype float16
+```
+
+- GPU:
+
+```bash
+docker run --gpus all \
+    -p 8080:80 \
+    -v $PWD/data:/data \
+    --pull always \
+    ghcr.io/huggingface/text-embeddings-inference:1.7 \
+    --model-id Alibaba-NLP/gte-multilingual-base \
+    --dtype float16
+```
+
+Then you can send requests to the deployed API via the OpenAI-compatible `v1/embeddings` route (more information about the [OpenAI Embeddings API](https://platform.openai.com/docs/api-reference/embeddings)):
+
+```bash
+curl http://0.0.0.0:8080/v1/embeddings \
+    -H "Content-Type: application/json" \
+    -d '{
+        "input": [
+            "what is the capital of China?",
+            "how to implement quick sort in python?",
+            "北京",
+            "快排算法介绍"
+        ],
+        "model": "Alibaba-NLP/gte-multilingual-base",
+        "encoding_format": "float"
+    }'
+```
+
 ### Use with custom code to get dense embeddings and sparse token weights
 ```python
 # You can find the script gte_embedding.py in https://huggingface.co/Alibaba-NLP/gte-multilingual-base/blob/main/scripts/gte_embedding.py
````