Inference on Intel Iris Xe integrated GPUs seems to be broken.

by AdiSkan - opened 25 days ago

25 days ago

Issue: Running inference with this model on the CPU works fine, but on the iGPU it does not. I'm not sure whether I need to do anything else or am doing something wrong.

CPU: Intel Core i7-1335U, iGPU: Intel Iris Xe

Code:
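import openvino_genai as ov_genai

# Path to the local OpenVINO model directory (not given in the original post).
model_path = "..."

device = "GPU"  # the post reports correct output with "CPU" and garbled output with "GPU"
pipe = ov_genai.LLMPipeline(model_path, device)
# As written, this reassigns the tokenizer's existing chat template to itself.
pipe.get_tokenizer().set_chat_template(pipe.get_tokenizer().chat_template)
print(pipe.generate("What is OpenVINO?", max_length=200))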

Inference result on iGPU:

,SON.TO: RODUCTION::是中国是一 TIN이:YING.AROories:ONEO * IN.SIRODUCTIONRODUCTIONA
.ロasy AND:

ON S exactlyRODUCTIONORODUCTION motivated,INO twoُ important:This NOT IN exactly two long.IN
YES importantONOasy스.IIAI

� 是中国 ANDO S I. NO ANDC I. IN ...

NOTOROSIS是中国是中国:

INSON I

ANDories ONT NOING I.ITYories IN does.TO IN

ONDMETHOD时期的

ANDO

isING,ANI、INOINO ANDIN

NOT,INO

ON S:,,
ON是一 I:TH!., IN ON,, NO ON AND IN IINO,,
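
A minimal sketch of a side-by-side check for the behaviour reported in this post: run the same prompt on each device and collect the outputs for comparison. `compare_devices` and `make_pipeline` are hypothetical helper names, not part of openvino_genai; the commented usage assumes the same `ov_genai.LLMPipeline(model_path, device)` call as in the code above and a `model_path` that points at the exported model.

```python
from typing import Callable, Dict, Iterable


def compare_devices(
    make_pipeline: Callable[[str], Callable[[str], str]],
    prompt: str,
    devices: Iterable[str] = ("CPU", "GPU"),
) -> Dict[str, str]:
    """Run the same prompt on each device and return {device: output}.

    make_pipeline is a hypothetical factory: given a device name, it returns
    a callable that maps a prompt to generated text.
    """
    results = {}
    for device in devices:
        generate = make_pipeline(device)
        results[device] = str(generate(prompt))
    return results


# Usage with openvino_genai (requires the package and a local model):
#
# import openvino_genai as ov_genai
#
# def make_pipeline(device):
#     pipe = ov_genai.LLMPipeline(model_path, device)
#     return lambda prompt: pipe.generate(prompt, max_length=200)
#
# for device, text in compare_devices(make_pipeline, "What is OpenVINO?").items():
#     print(f"--- {device} ---\n{text}")
```

Small wording differences between devices can be legitimate, so this only makes a gross mismatch, such as the output above, easy to show in a bug report.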

vinsblack

25 days ago

Thank you for your observations. I have optimized my dataset according to your suggestions. Please try it and send me your feedback. Obviously this is a trial format; the real dataset consists of about 1.4 TB of high-quality data.

AdiSkan

24 days ago

Hi @vinsblack, thanks for responding, but the problem is that the inference result is correct on the CPU and not on the iGPU with the latest version of openvino_genai. I am not sure this is related to the dataset.

vinsblack

24 days ago

Indeed, OpenVINO GenAI has some problems with Intel processors because of their integrated graphics; you did not mention any graphics card such as NVIDIA, MSI, etc. These are the suggestions you can try:

Enable integrated graphics in the BIOS. In the BIOS menu, the setting was under Chipset > Internal Graphics > [Enabled].
Follow the Intel processor graphics configuration guide.
Very often, to work around this problem, the dataset is optimized with quality examples so that the pre-trained model "tricks" the library in question. For example, my dataset meets these requirements, which is why I mentioned it.

AdiSkan

24 days ago

Ah, I understand. I will try your suggestion. Thank you, @vinsblack.

Echo9Zulu

OpenVINO Toolkit org 22 days ago

@AdiSkan

What's up!

You can test out my quant before going deeper: https://huggingface.co/Echo9Zulu/Qwen3-0.6B-int8_asym-ov/tree/main

When applying weight-only compression with int8_asym, no calibration dataset is used. Additionally, the runtime info tags from openvino_model.xml do not report that a dataset was used, which is consistent with the default conversion behavior. Is there a different use case for the dataset mentioned by @vinsblack?
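
To illustrate why no calibration data is needed, here is a simplified sketch of asymmetric 8-bit weight quantization for a single row of weights: the scale and zero point come from the weights' own minimum and maximum. This is a generic textbook formulation in plain Python, not NNCF's actual implementation, and the function names are hypothetical.

```python
from typing import List, Tuple


def quantize_int8_asym(weights: List[float]) -> Tuple[List[int], float, int]:
    """Quantize one row of weights to unsigned 8-bit codes with a zero point.

    Only the weights themselves are inspected; no input data is required.
    """
    # Extend the range to include 0.0 so that zero is exactly representable.
    w_min = min(min(weights), 0.0)
    w_max = max(max(weights), 0.0)
    # Fall back to a scale of 1.0 for an all-zero row (range of zero).
    scale = (w_max - w_min) / 255.0 or 1.0
    zero_point = max(0, min(255, round(-w_min / scale)))
    quantized = [max(0, min(255, round(w / scale) + zero_point)) for w in weights]
    return quantized, scale, zero_point


def dequantize(quantized: List[int], scale: float, zero_point: int) -> List[float]:
    """Map the 8-bit codes back to approximate floating-point weights."""
    return [(q - zero_point) * scale for q in quantized]


row = [-1.0, 0.0, 0.5, 1.0]
codes, scale, zero_point = quantize_int8_asym(row)
print(codes, scale, zero_point)
print(dequantize(codes, scale, zero_point))
```

Data-aware methods differ in that they also look at activations from sample inputs when choosing these parameters, which is where a dataset would come in.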

<rt_info>
        <Runtime_version value="2025.1.0-18503-6fec06580ab-releases/2025/1" />
        <conversion_parameters>
            <framework value="pytorch" />
            <is_python_object value="True" />
        </conversion_parameters>
        <nncf>
            <friendly_names_were_updated value="True" />
            <weight_compression>
                <advanced_parameters value="{'statistics_path': None, 'awq_params': {'subset_size': 32, 'percent_to_apply': 0.002, 'alpha_min': 0.0, 'alpha_max': 1.0, 'steps': 100}, 'scale_estimation_params': {'subset_size': 64, 'initial_steps': 5, 'scale_steps': 5, 'weight_penalty': -1.0}, 'gptq_params': {'damp_percent': 0.1, 'block_size': 128, 'subset_size': 128}, 'lora_correction_params': {'adapter_rank': 8, 'num_iterations': 3, 'apply_regularization': True, 'subset_size': 128, 'use_int8_adapters': True}}" />
                <all_layers value="False" />
                <awq value="False" />
                <backup_mode value="int8_asym" />
                <gptq value="False" />
                <group_size value="-1" />
                <ignored_scope value="[]" />
                <lora_correction value="False" />
                <mode value="int8_asym" />
                <ratio value="1.0" />
                <scale_estimation value="False" />
                <sensitivity_metric value="weight_quantization_error" />
            </weight_compression>
        </nncf>
        <optimum>
            <nncf_version value="2.15.0" />
            <optimum_intel_version value="1.23.0.dev0+6b993b8" />
            <optimum_version value="1.25.0.dev0" />
            <pytorch_version value="2.7.0+cpu" />
            <transformers_version value="4.51.3" />
        </optimum>
        <runtime_options>
            <ACTIVATIONS_SCALE_FACTOR value="8.0" />
        </runtime_options>
    </rt_info>
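
The check described above, reading the `nncf` weight-compression tags from the `rt_info` block, can be sketched with the Python standard library. The helper name `weight_compression_info` is hypothetical, and the embedded XML is a shortened copy of the block above; for a real model, one would presumably pass the contents of openvino_model.xml instead.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# Shortened copy of the rt_info block quoted above.
RT_INFO_XML = """
<rt_info>
    <nncf>
        <weight_compression>
            <awq value="False" />
            <gptq value="False" />
            <mode value="int8_asym" />
            <scale_estimation value="False" />
        </weight_compression>
    </nncf>
</rt_info>
"""


def weight_compression_info(xml_text: str) -> Dict[str, str]:
    """Return {tag: value} for the children of nncf/weight_compression."""
    root = ET.fromstring(xml_text.strip())
    # Accept either an <rt_info> root or a larger document that contains one.
    rt_info = root if root.tag == "rt_info" else root.find(".//rt_info")
    if rt_info is None:
        return {}
    section = rt_info.find("nncf/weight_compression")
    if section is None:
        return {}
    return {child.tag: child.get("value", "") for child in section}


info = weight_compression_info(RT_INFO_XML)
# The options that would need sample data are all reported as "False" here.
print(info["mode"], info["awq"], info["gptq"], info["scale_estimation"])
# prints: int8_asym False False False
```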
