timm/ViT-SO400M-16-SigLIP2-384 · ViT-SO400M-16-SigLIP2-384 inference fails when using the README code

Mar 21

I used the following code from the README to run inference with the ViT-SO400M-16-SigLIP2-384 model, and ran into issues:

import torch
import torch.nn.functional as F
from urllib.request import urlopen
from PIL import Image
from open_clip import create_model_from_pretrained, get_tokenizer # works on open-clip-torch >= 2.31.0, timm >= 1.0.15

model, preprocess = create_model_from_pretrained('hf-hub:timm/ViT-SO400M-16-SigLIP2-384')
tokenizer = get_tokenizer('hf-hub:timm/ViT-SO400M-16-SigLIP2-384')

image = Image.open(urlopen(
'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
))
image = preprocess(image).unsqueeze(0)

labels_list = ["a dog", "a cat", "a donut", "a beignet"]
text = tokenizer(labels_list, context_length=model.context_length)

with torch.no_grad(), torch.cuda.amp.autocast():
    image_features = model.encode_image(image, normalize=True)
    text_features = model.encode_text(text, normalize=True)
    text_probs = torch.sigmoid(image_features @ text_features.T * model.logit_scale.exp() + model.logit_bias)

zipped_list = list(zip(labels_list, [100 * round(p.item(), 3) for p in text_probs[0]]))
print("Label probabilities: ", zipped_list)

ctgushiwei

Mar 21

My installed versions:
open_clip.version '2.31.0'
timm 1.0.15

Tberriel

Apr 7

Hi! You need to update the transformers library to the latest version: pip install transformers==4.51.0
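
To confirm the environment before re-running the README snippet, a quick version check along these lines may help. This is only a sketch: the open-clip-torch and timm minimums come from the comment in the README code, while treating transformers 4.51.0 as a lower bound (rather than an exact pin) is an assumption, and the helper names parse and check are made up for illustration.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Minimums taken from this thread. Treating 4.51.0 as a lower bound
# for transformers is an assumption, not something the thread confirms.
REQUIRED = {
    "open-clip-torch": "2.31.0",
    "timm": "1.0.15",
    "transformers": "4.51.0",
}

def parse(v):
    """Turn '4.51.0' into (4, 51, 0), keeping only the leading digits of each part."""
    parts = []
    for piece in v.split(".")[:3]:
        m = re.match(r"\d+", piece)
        parts.append(int(m.group()) if m else 0)
    while len(parts) < 3:  # pad '4.51' to (4, 51, 0)
        parts.append(0)
    return tuple(parts)

def check(required=REQUIRED):
    """Return {package: (installed_version_or_None, meets_minimum)}."""
    report = {}
    for name, minimum in required.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = (None, False)
            continue
        report[name] = (installed, parse(installed) >= parse(minimum))
    return report

for name, (installed, ok) in check().items():
    status = "OK" if ok else "needs update"
    print(f"{name}: installed={installed}, required>={REQUIRED[name]} -> {status}")
```

Any package reported as "needs update" can then be upgraded with pip before retrying.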

ctgushiwei

Apr 7

OK, thank you! I have solved this issue.