Phi-3.5-vision-instruct-int8-ov

Description

This is the microsoft/Phi-3.5-vision-instruct model converted to the OpenVINO™ IR (Intermediate Representation) format, with weights compressed to INT8 by NNCF.

Quantization Parameters

Weight compression was performed using nncf.compress_weights with the following parameters:

  • mode: INT8_ASYM
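
For intuition about what asymmetric 8-bit weight compression means, here is a minimal pure-Python sketch. It is not NNCF's implementation: the helper names (`quantize_row_asym`, `dequantize_row`) and the per-row grouping are assumptions made for illustration only. Each row of weights is mapped to unsigned 8-bit integers using a scale and a zero point, so that the row's minimum and maximum land near 0 and 255.

```python
def quantize_row_asym(row, levels=256):
    """Map a list of floats to integers in [0, levels - 1].

    Returns the quantized values together with the scale and zero point
    needed to approximately reconstruct the original floats.
    """
    lo, hi = min(row), max(row)
    # A constant row would give a zero scale, so fall back to 1.0.
    scale = (hi - lo) / (levels - 1) or 1.0
    # The zero point is the integer that represents the float value 0.0.
    zero_point = round(-lo / scale)
    quantized = [
        min(levels - 1, max(0, round(w / scale) + zero_point)) for w in row
    ]
    return quantized, scale, zero_point


def dequantize_row(quantized, scale, zero_point):
    """Approximately invert quantize_row_asym."""
    return [(v - zero_point) * scale for v in quantized]


if __name__ == "__main__":
    weights = [-1.0, -0.25, 0.0, 0.5, 1.0]
    q, scale, zp = quantize_row_asym(weights)
    print("quantized:", q)
    print("restored: ", dequantize_row(q, scale, zp))
```

The restored values differ from the originals by at most about one quantization step (`scale`), which is the precision traded for storing each weight in a single byte.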

Compatibility

The provided OpenVINO™ IR model is compatible with:

  • OpenVINO version 2025.0.0 and higher
  • Optimum Intel 1.21.0 and higher
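
One way to check an environment against these minimums is sketched below. It uses only the standard library; the PyPI distribution names `openvino` and `optimum-intel` are assumed, and `parse_release` and `check_versions` are hypothetical helpers, not part of either package. The comparison looks only at the leading numeric components, so a pre-release such as `1.21.0rc1` is treated the same as `1.21.0`.

```python
from importlib.metadata import PackageNotFoundError, version

# Minimum versions from the compatibility list; distribution names are assumed.
MINIMUMS = {"openvino": (2025, 0, 0), "optimum-intel": (1, 21, 0)}


def parse_release(text):
    """Turn '2025.0.0' or '1.22.0.dev0+abc' into a tuple of leading integers."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    # Pad short versions such as '2025.1' to three components.
    return tuple(parts + [0] * (3 - len(parts)))


def check_versions(minimums=MINIMUMS):
    """Return {name: (installed_version_or_None, meets_minimum)}."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = (None, False)
            continue
        report[name] = (installed, parse_release(installed) >= minimum)
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_versions().items():
        status = "OK" if ok else "needs install or upgrade"
        print(f"{name}: {installed or 'not installed'} -> {status}")
```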

Running Model Inference with Optimum Intel

  1. Install the packages required to use the Optimum Intel integration with the OpenVINO backend:
pip install --pre -U --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/pre-release openvino_tokenizers openvino

pip install git+https://github.com/huggingface/optimum-intel.git
  2. Run model inference:
from PIL import Image 
import requests 
from optimum.intel.openvino import OVModelForVisualCausalLM
from transformers import AutoProcessor, TextStreamer

model_id = "OpenVINO/Phi-3.5-vision-instruct-int8-ov"

processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

ov_model = OVModelForVisualCausalLM.from_pretrained(model_id, trust_remote_code=True)
prompt = "<|image_1|>\nWhat is unusual on this picture?"

url = "https://github.com/openvinotoolkit/openvino_notebooks/assets/29454499/d5fbbd1a-d484-415c-88cb-9986625b7b11"
image = Image.open(requests.get(url, stream=True).raw)

inputs = ov_model.preprocess_inputs(text=prompt, image=image, processor=processor)

generation_args = {
    "max_new_tokens": 50,
    "temperature": 0.0,
    "do_sample": False,
    "streamer": TextStreamer(processor.tokenizer, skip_prompt=True, skip_special_tokens=True),
}

generate_ids = ov_model.generate(
    **inputs,
    eos_token_id=processor.tokenizer.eos_token_id,
    **generation_args,
)

# Drop the prompt tokens so that only the newly generated tokens are decoded.
generate_ids = generate_ids[:, inputs['input_ids'].shape[1]:]
response = processor.batch_decode(
    generate_ids,
    skip_special_tokens=True,
    clean_up_tokenization_spaces=False,
)[0]

Limitations

Check the original model card for limitations.

Legal information

The original model is distributed under the MIT license. More details can be found in the original model card.
