Commit 5cf2132 (verified) · Parent: 408b81b
Authors: littlebird13, alvarobartt (HF Staff)

Update README.md with TEI support (#13)

- Update README.md (afeb3eee5b8275037541e69bda662ec11dcc1581)

Co-authored-by: Alvaro Bartolome <[email protected]>

Files changed (1):
  1. README.md +24 -0
README.md CHANGED

@@ -7,6 +7,7 @@ tags:
 - sentence-transformers
 - sentence-similarity
 - feature-extraction
+- text-embeddings-inference
 ---
 # Qwen3-Embedding-4B

@@ -203,6 +204,29 @@ print(scores.tolist())

 📌 **Tip**: We recommend that developers customize the `instruct` according to their specific scenarios, tasks, and languages. Our tests have shown that in most retrieval scenarios, not using an `instruct` on the query side can lead to a drop in retrieval performance by approximately 1% to 5%.

+### Text Embeddings Inference (TEI) Usage
+
+You can run / deploy TEI either on NVIDIA GPUs as:
+
+```bash
+docker run --gpus all -p 8080:80 -v hf_cache:/data --pull always ghcr.io/huggingface/text-embeddings-inference:1.7.2 --model-id Qwen/Qwen3-Embedding-4B --dtype float16
+```
+
+Or on CPU devices as:
+
+```bash
+docker run -p 8080:80 -v hf_cache:/data --pull always ghcr.io/huggingface/text-embeddings-inference:cpu-1.7.2 --model-id Qwen/Qwen3-Embedding-4B --dtype float16
+```
+
+And then, generate the embeddings by sending an HTTP POST request as:
+
+```bash
+curl http://localhost:8080/embed \
+    -X POST \
+    -d '{"inputs": ["Instruct: Given a web search query, retrieve relevant passages that answer the query\nQuery: What is the capital of China?", "Instruct: Given a web search query, retrieve relevant passages that answer the query\nQuery: Explain gravity"]}' \
+    -H "Content-Type: application/json"
+```
+
 ## Evaluation

 ### MTEB (Multilingual)
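
---

The curl request in the added TEI section can also be issued from Python. Below is a minimal sketch, assuming a TEI container started with one of the `docker run` commands above is serving Qwen/Qwen3-Embedding-4B on `localhost:8080`; the helper names `build_payload` and `embed` are illustrative, not part of TEI or this repository.

```python
import json
import urllib.request

# Assumes the TEI container above is listening here.
TEI_URL = "http://localhost:8080/embed"

DEFAULT_TASK = ("Given a web search query, retrieve relevant passages "
                "that answer the query")

def build_payload(queries, task=DEFAULT_TASK):
    # TEI embeds raw strings, so the instruct prefix is applied
    # client-side, matching the curl example in the README diff.
    inputs = [f"Instruct: {task}\nQuery: {q}" for q in queries]
    return {"inputs": inputs}

def embed(queries):
    # POST JSON to /embed; the response body is a JSON list of
    # embedding vectors, one per input string.
    data = json.dumps(build_payload(queries)).encode("utf-8")
    req = urllib.request.Request(
        TEI_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Example (requires the TEI container to be running):
# vectors = embed(["What is the capital of China?", "Explain gravity"])
```

Note that, per the tip in the README, omitting the instruct prefix on the query side typically costs roughly 1% to 5% in retrieval performance, so the prefix is applied by default here.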