---
license: apache-2.0
base_model: Qwen/Qwen3-Embedding-0.6B
tags:
- qwen3
- text-embeddings-inference
- onnx
- sentence-transformers
- feature-extraction
- sentence-similarity
language:
- multilingual
pipeline_tag: sentence-similarity
library_name: sentence-transformers
---

# Qwen3-Embedding-0.6B ONNX for TEI

This is an ONNX version of [Qwen/Qwen3-Embedding-0.6B](https://huggingface.co/Qwen/Qwen3-Embedding-0.6B) optimized for [Text Embeddings Inference (TEI)](https://github.com/huggingface/text-embeddings-inference).

## Model Details

- **Base Model**: Qwen/Qwen3-Embedding-0.6B
- **Format**: ONNX with external data (`model.onnx` + `model.onnx_data`)
- **Pooling**: Mean pooling (built into the ONNX graph)
- **Embedding Dimension**: 1024
- **Max Sequence Length**: 32768 tokens

## Usage with TEI

```bash
docker run --gpus all -p 8080:80 -v $PWD:/data \
  ghcr.io/huggingface/text-embeddings-inference:latest \
  --model-id janni-t/qwen3-embedding-0.6b-tei-onnx
```

For CPU inference:

```bash
docker run -p 8080:80 -v $PWD:/data \
  ghcr.io/huggingface/text-embeddings-inference:cpu-latest \
  --model-id janni-t/qwen3-embedding-0.6b-tei-onnx
```

## Conversion Details

This model was converted from the original PyTorch model to ONNX format with:

- Consolidated external data for TEI compatibility
- Mean pooling integrated into the ONNX graph
- Optimized for CPU inference

## Original Model

See the original model card at [Qwen/Qwen3-Embedding-0.6B](https://huggingface.co/Qwen/Qwen3-Embedding-0.6B) for:

- Model architecture details
- Training information
- Benchmark results
- Citation information

## License

Apache 2.0 (same as the original model)
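
## Example Request

Once a TEI container from either command above is running, embeddings can be requested over HTTP. A minimal sketch using TEI's `/embed` endpoint, assuming the default `8080:80` port mapping from the commands above and an example input string of your choosing:

```bash
# Query the running TEI server; the port and input text are placeholders.
curl 127.0.0.1:8080/embed \
    -X POST \
    -H 'Content-Type: application/json' \
    -d '{"inputs": "What is the capital of France?"}'
```

The response is a JSON array with one 1024-dimensional embedding per input string.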