---
license: apache-2.0
datasets:
- HuggingFaceH4/ultrachat_200k
language:
- en
- es
base_model:
- Qwen/Qwen3-Embedding-0.6B
pipeline_tag: feature-extraction
---
# prudant/Qwen3-Embedding-0.6B-W8A8
This is a compressed version of Qwen/Qwen3-Embedding-0.6B, produced with llm-compressor using the W8A8 quantization scheme.

**Important**: you must read the following guide for correct usage of this model: [Guide](https://github.com/vllm-project/vllm/pull/19260)
## Model Details
- **Original Model**: Qwen/Qwen3-Embedding-0.6B
- **Quantization Method**: GPTQ
- **Compression Library**: [llm-compressor](https://github.com/vllm-project/llm-compressor)
- **Calibration Dataset**: ultrachat_200k (1024 samples)
- **Optimized For**: Inference with vLLM
- **License**: same as the original model