---
license: apache-2.0
datasets:
- HuggingFaceH4/ultrachat_200k
language:
- en
- es
base_model:
- Qwen/Qwen3-Embedding-0.6B
pipeline_tag: feature-extraction
---

# prudant/Qwen3-Embedding-0.6B-W8A8

This is a compressed version of Qwen/Qwen3-Embedding-0.6B, quantized with llm-compressor using the W8A8 scheme.

**Important**: For correct usage of this model, you must read the accompanying guide: [Guide](https://github.com/vllm-project/vllm/pull/19260)

## Model Details

- **Original Model**: Qwen/Qwen3-Embedding-0.6B
- **Quantization Method**: GPTQ
- **Compression Library**: [llm-compressor](https://github.com/vllm-project/llm-compressor)
- **Calibration Dataset**: ultrachat_200k (1024 samples)
- **Optimized For**: Inference with vLLM
- **License**: same as original model
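Since the model is optimized for vLLM, serving it might look like the sketch below. This is illustrative only: it assumes a vLLM build with embedding-model support as described in the linked guide, and the port number is arbitrary.

```shell
# Serve the quantized embedding model with vLLM's OpenAI-compatible server
# (embedding task; see the linked guide for version requirements).
vllm serve prudant/Qwen3-Embedding-0.6B-W8A8 --task embed --port 8000

# Query the embeddings endpoint.
curl http://localhost:8000/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{"model": "prudant/Qwen3-Embedding-0.6B-W8A8", "input": "Hello world"}'
```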