---
license: apache-2.0
datasets:
- HuggingFaceH4/ultrachat_200k
language:
- en
- es
base_model:
- Qwen/Qwen3-Embedding-0.6B
pipeline_tag: feature-extraction
---

# prudant/Qwen3-Embedding-0.6B-W8A8

This is a compressed version of Qwen/Qwen3-Embedding-0.6B, produced with llm-compressor using the W8A8 quantization scheme.

**Important**: You MUST read the following guide for correct usage of this model: [Guide](https://github.com/vllm-project/vllm/pull/19260)

## Model Details

- **Original Model**: Qwen/Qwen3-Embedding-0.6B
- **Quantization Method**: GPTQ
- **Compression Libraries**: [llm-compressor](https://github.com/vllm-project/llm-compressor)
- **Calibration Dataset**: ultrachat_200k (1024 samples)
- **Optimized For**: Inference with vLLM
- **License**: same as original model
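
## Usage with vLLM

Since the model is optimized for vLLM inference, a minimal serving sketch is shown below. This assumes a recent vLLM installation with a compatible GPU; flags and defaults may differ across vLLM versions, so check the linked guide before relying on this.

```shell
# Serve the model with vLLM's OpenAI-compatible server.
# --task embed selects the embedding/pooling runner (name may vary by vLLM version).
vllm serve prudant/Qwen3-Embedding-0.6B-W8A8 --task embed

# Query the OpenAI-compatible embeddings endpoint (default port 8000):
curl http://localhost:8000/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{
        "model": "prudant/Qwen3-Embedding-0.6B-W8A8",
        "input": "What is the capital of Spain?"
      }'
```

The response follows the OpenAI embeddings format, with the vector under `data[0].embedding`.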