---
license: apache-2.0
datasets:
- HuggingFaceH4/ultrachat_200k
language:
- en
- es
base_model:
- Qwen/Qwen3-Embedding-0.6B
pipeline_tag: feature-extraction
---

# prudant/Qwen3-Embedding-0.6B-W8A8

This is a compressed version of Qwen/Qwen3-Embedding-0.6B, produced with llm-compressor using the W8A8 quantization scheme.

**Important**: You MUST read the following guide for correct usage of this model: [Guide](https://github.com/vllm-project/vllm/pull/19260)

## Model Details

- **Original Model**: Qwen/Qwen3-Embedding-0.6B
- **Quantization Method**: GPTQ
- **Compression Libraries**: [llm-compressor](https://github.com/vllm-project/llm-compressor)
- **Calibration Dataset**: ultrachat_200k (1024 samples)
- **Optimized For**: Inference with vLLM
- **License**: same as original model
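
## Usage with vLLM

Since the model is optimized for vLLM inference, a minimal serving sketch is shown below. This assumes a recent vLLM installation with a compatible GPU; flags and defaults may differ across vLLM versions, so check the linked guide before relying on this.

```shell
# Serve the model with vLLM's OpenAI-compatible server.
# --task embed selects the embedding/pooling runner (name may vary by vLLM version).
vllm serve prudant/Qwen3-Embedding-0.6B-W8A8 --task embed

# Query the OpenAI-compatible embeddings endpoint (default port 8000):
curl http://localhost:8000/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{
        "model": "prudant/Qwen3-Embedding-0.6B-W8A8",
        "input": "What is the capital of Spain?"
      }'
```

The response follows the OpenAI embeddings format, with the vector under `data[0].embedding`.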