helenai/Qwen2.5-VL-3B-Instruct-ov-int4


helenai committed on Sep 4

Commit 67dc411 · verified · 1 Parent(s): 86452f2

Create README.md

Files changed (1)

README.md +85 -0

README.md ADDED

	@@ -0,0 +1,85 @@

+---
+base_model:
+- Qwen/Qwen2.5-VL-3B-Instruct
+---
+This is the [Qwen/Qwen2.5-VL-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-VL-3B-Instruct) model, converted to OpenVINO, with int4 weights for the language model and int8 weights for the other models.
+## Download Model
+To download the model, run `pip install huggingface-hub[cli]` and then:
+```
+huggingface-cli download helenai/Qwen2.5-VL-3B-Instruct-ov-int4 --local-dir Qwen2.5-VL-3B-Instruct-ov-int4
+```
+## Run inference with OpenVINO GenAI
+Use OpenVINO GenAI to run inference on this model. This model works with OpenVINO GenAI 2025.2 and later.
+- Install OpenVINO GenAI and pillow:
+```
+pip install --upgrade openvino-genai pillow
+```
+- Download a test image: `curl -O "https://storage.openvinotoolkit.org/test_data/images/dog.jpg"`
+- Run inference:
+```python
+import numpy as np
+import openvino as ov
+import openvino_genai
+from PIL import Image
+# Choose GPU instead of CPU in the line below to run the model on Intel integrated or discrete GPU
+pipe = openvino_genai.VLMPipeline("Qwen2.5-VL-3B-Instruct-ov-int4", "CPU")
+# Convert to RGB so the array always has 3 channels (e.g. for RGBA or grayscale inputs)
+image = Image.open("dog.jpg").convert("RGB")
+# Optional: resizing to a smaller size (depending on image and prompt) often speeds up inference.
+image = image.resize((128, 128))
+# Build a uint8 array with shape (1, height, width, 3) and wrap it in an OpenVINO tensor
+image_data = np.array(image, dtype=np.uint8)[np.newaxis, ...]
+image_data = ov.Tensor(image_data)
+prompt = "Can you describe the image?"
+result = pipe.generate(prompt, image=image_data, max_new_tokens=100)
+print(result.texts[0])
+```
+See the [OpenVINO GenAI repository](https://github.com/openvinotoolkit/openvino.genai?tab=readme-ov-file#performing-visual-language-text-generation) for more information.
+## Model export properties
+Model export command:
+```
+optimum-cli export openvino -m Qwen/Qwen2.5-VL-3B-Instruct --weight-format int4 Qwen2.5-VL-3B-Instruct-ov-int4
+```
+### Framework versions
+```
+openvino         : 2025.2.0-19140-c01cd93e24d-releases/2025/2
+nncf             : 2.17.0.dev0+c6296072
+optimum_intel    : 1.26.0.dev0+0e2ccef
+optimum          : 1.27.0
+pytorch          : 2.7.0+cpu
+transformers     : 4.51.3
+```
+### LLM export properties
+```
+all_layers               : False
+awq                      : False
+backup_mode              : int8_asym
+compression_format       : dequantize
+gptq                     : False
+group_size               : 128
+ignored_scope            : []
+lora_correction          : False
+mode                     : int4_asym
+ratio                    : 1.0
+scale_estimation         : False
+sensitivity_metric       : weight_quantization_error
+```
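
The `mode: int4_asym` and `group_size: 128` properties above describe group-wise asymmetric 4-bit weight quantization. The following is a minimal pure-Python sketch of that idea, assuming each group of 128 weights gets its own scale and zero point. It is a simplified illustration only, not NNCF's actual implementation, and the function names (`quantize_group`, `dequantize_group`, `quantize`) are hypothetical.

```python
import math

GROUP_SIZE = 128  # matches `group_size` in the export properties


def quantize_group(weights):
    """Map one group of floats to unsigned 4-bit codes (0..15)."""
    # Extend the range to include 0.0 so the zero point lands in 0..15.
    w_min = min(min(weights), 0.0)
    w_max = max(max(weights), 0.0)
    # Fall back to a scale of 1.0 for an all-zero group.
    scale = (w_max - w_min) / 15 or 1.0
    zero_point = round(-w_min / scale)
    codes = [min(15, max(0, round(w / scale) + zero_point)) for w in weights]
    return codes, scale, zero_point


def dequantize_group(codes, scale, zero_point):
    """Recover approximate float weights from 4-bit codes."""
    return [(c - zero_point) * scale for c in codes]


def quantize(weights, group_size=GROUP_SIZE):
    """Quantize a flat list of weights, one (codes, scale, zero_point) per group."""
    return [
        quantize_group(weights[i:i + group_size])
        for i in range(0, len(weights), group_size)
    ]


# Toy weights: 256 values, so two groups of 128.
weights = [math.sin(i) * 0.1 for i in range(256)]
groups = quantize(weights)
restored = [w for g in groups for w in dequantize_group(*g)]
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(len(groups), max_error)
```

In this sketch the reconstruction error of each weight is at most half of its group's scale, which is why a smaller group size generally reduces error at the cost of storing more scales and zero points.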