---
license: apache-2.0
base_model: Qwen/Qwen3-Embedding-0.6B
tags:
- qwen3
- text-embeddings-inference
- onnx
- sentence-transformers
- feature-extraction
- sentence-similarity
language:
- multilingual
pipeline_tag: sentence-similarity
library_name: sentence-transformers
---
# Qwen3-Embedding-0.6B ONNX for TEI
This is an ONNX version of [Qwen/Qwen3-Embedding-0.6B](https://huggingface.co/Qwen/Qwen3-Embedding-0.6B) optimized for [Text Embeddings Inference (TEI)](https://github.com/huggingface/text-embeddings-inference).
## Model Details
- **Base Model**: Qwen/Qwen3-Embedding-0.6B
- **Format**: ONNX with external data (`model.onnx` + `model.onnx_data`)
- **Pooling**: Mean pooling (built into the ONNX graph)
- **Embedding Dimension**: 1024
- **Max Sequence Length**: 32768 tokens
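Embeddings returned by this model are plain 1024-float vectors, so they can be compared with standard cosine similarity. A minimal, dependency-free sketch (the function name and the toy 4-dim vectors are illustrative, standing in for real 1024-dim outputs):
```python
import math

def cosine_similarity(a, b):
    # Normalized dot product of two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 4-dim vectors stand in for the model's 1024-dim embeddings.
v1 = [0.1, 0.2, 0.3, 0.4]
v2 = [0.4, 0.3, 0.2, 0.1]
print(round(cosine_similarity(v1, v1), 6))  # identical vectors score 1.0
print(round(cosine_similarity(v1, v2), 6))
```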
## Usage with TEI
```bash
docker run --gpus all -p 8080:80 -v "$PWD":/data \
  ghcr.io/huggingface/text-embeddings-inference:latest \
  --model-id janni-t/qwen3-embedding-0.6b-tei-onnx
```
For CPU inference:
```bash
docker run -p 8080:80 -v "$PWD":/data \
  ghcr.io/huggingface/text-embeddings-inference:cpu-latest \
  --model-id janni-t/qwen3-embedding-0.6b-tei-onnx
```
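Once a container is running, embeddings can be requested through TEI's HTTP API. A minimal sketch using TEI's `POST /embed` endpoint (assumes the default port mapping above; the response is a JSON array with one 1024-float vector per input):
```shell
curl 127.0.0.1:8080/embed \
  -X POST \
  -H 'Content-Type: application/json' \
  -d '{"inputs": ["What is the capital of France?"]}'
```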
## Conversion Details
This model was converted from the original PyTorch model to ONNX format with:
- Consolidated external data for TEI compatibility
- Mean pooling integrated into the ONNX graph
- Optimized for CPU inference
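Because the pooling step lives inside the graph, the ONNX model emits one sentence vector directly rather than per-token states. For reference, the mean-pooling operation baked in is equivalent to the following sketch (names and the 2-dim toy inputs are illustrative; the real graph works on 1024-dim hidden states):
```python
def mean_pool(token_embeddings, attention_mask):
    # Average token vectors, counting only non-padding positions
    # (attention_mask is 1 for real tokens, 0 for padding).
    dims = len(token_embeddings[0])
    pooled = [0.0] * dims
    n = 0
    for vec, mask in zip(token_embeddings, attention_mask):
        if mask:
            n += 1
            for i in range(dims):
                pooled[i] += vec[i]
    return [x / n for x in pooled]

# Two real tokens plus one padding token: padding is excluded from the mean.
print(mean_pool([[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]], [1, 1, 0]))  # [2.0, 3.0]
```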
## Original Model
See the original model card at [Qwen/Qwen3-Embedding-0.6B](https://huggingface.co/Qwen/Qwen3-Embedding-0.6B) for:
- Model architecture details
- Training information
- Benchmark results
- Citation information
## License
Apache 2.0 (same as the original model)