janni-t committed · Commit 0c2b7f7 · verified · 1 Parent(s): aae906c

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +61 -0
README.md ADDED
@@ -0,0 +1,61 @@
+ ---
+ license: apache-2.0
+ base_model: Qwen/Qwen3-Embedding-0.6B
+ tags:
+ - qwen3
+ - text-embeddings-inference
+ - onnx
+ - sentence-transformers
+ - feature-extraction
+ - sentence-similarity
+ language:
+ - multilingual
+ pipeline_tag: sentence-similarity
+ library_name: sentence-transformers
+ ---
+
+ # Qwen3-Embedding-0.6B ONNX for TEI
+
+ This is an ONNX version of [Qwen/Qwen3-Embedding-0.6B](https://huggingface.co/Qwen/Qwen3-Embedding-0.6B) optimized for [Text Embeddings Inference (TEI)](https://github.com/huggingface/text-embeddings-inference).
+
+ ## Model Details
+
+ - **Base Model**: Qwen/Qwen3-Embedding-0.6B
+ - **Format**: ONNX with external data (`model.onnx` + `model.onnx_data`)
+ - **Pooling**: Mean pooling (built into the ONNX graph)
+ - **Embedding Dimension**: 1024
+ - **Max Sequence Length**: 32768 tokens
+
+ ## Usage with TEI
+
+ ```bash
+ docker run --gpus all -p 8080:80 -v $PWD:/data \
+ ghcr.io/huggingface/text-embeddings-inference:latest \
+ --model-id janni-t/qwen3-embedding-0.6b-tei-onnx
+ ```
+
+ For CPU inference:
+ ```bash
+ docker run -p 8080:80 -v $PWD:/data \
+ ghcr.io/huggingface/text-embeddings-inference:cpu-latest \
+ --model-id janni-t/qwen3-embedding-0.6b-tei-onnx
+ ```
+
+ ## Conversion Details
+
+ This model was converted from the original PyTorch model to ONNX format with:
+ - Consolidated external data for TEI compatibility
+ - Mean pooling integrated into the ONNX graph
+ - Optimized for CPU inference
+
+ ## Original Model
+
+ See the original model card at [Qwen/Qwen3-Embedding-0.6B](https://huggingface.co/Qwen/Qwen3-Embedding-0.6B) for:
+ - Model architecture details
+ - Training information
+ - Benchmark results
+ - Citation information
+
+ ## License
+
+ Apache 2.0 (same as the original model)
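
The README's "Usage with TEI" section starts the server but does not show a request. The sketch below is one way to query it, assuming the container is reachable on `localhost:8080` (the host port mapped in the `docker run` commands) and using TEI's `/embed` route. The `TEI_URL` variable and the `embed_payload` and `embed` helpers are illustrative names, not part of the README or of TEI.

```bash
#!/bin/sh
# Sketch only: TEI_URL, embed_payload and embed are hypothetical helper names.
# Assumes a TEI container started with "-p 8080:80" as in the README.
TEI_URL="${TEI_URL:-http://localhost:8080}"

# Build the JSON body for POST /embed.
# Escapes backslashes and double quotes only; newlines and other
# control characters in the input are not handled.
embed_payload() {
  escaped=$(printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g')
  printf '{"inputs": "%s"}' "$escaped"
}

# Send one text to the server and print the raw JSON response.
embed() {
  curl -sS -X POST "$TEI_URL/embed" \
    -H 'Content-Type: application/json' \
    -d "$(embed_payload "$1")"
}

# Example call (requires the server to be running):
#   embed "What is ONNX?"
```

The response should be a JSON array containing one vector per input. Given the model details in the README, each vector would be expected to have 1024 entries, which is a quick sanity check that the right model was loaded.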