ViT-SO400M-16-SigLIP2-384 inference fails when using the code from the README

#1
by ctgushiwei - opened

I used the following code to run inference with the ViT-SO400M-16-SigLIP2-384 model and ran into the following issue:

[Screenshot: image.png]

import torch
import torch.nn.functional as F
from urllib.request import urlopen
from PIL import Image
from open_clip import create_model_from_pretrained, get_tokenizer # works on open-clip-torch >= 2.31.0, timm >= 1.0.15

model, preprocess = create_model_from_pretrained('hf-hub:timm/ViT-SO400M-16-SigLIP2-384')
tokenizer = get_tokenizer('hf-hub:timm/ViT-SO400M-16-SigLIP2-384')

image = Image.open(urlopen(
    'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
))
image = preprocess(image).unsqueeze(0)

labels_list = ["a dog", "a cat", "a donut", "a beignet"]
text = tokenizer(labels_list, context_length=model.context_length)

with torch.no_grad(), torch.cuda.amp.autocast():
    image_features = model.encode_image(image, normalize=True)
    text_features = model.encode_text(text, normalize=True)
    text_probs = torch.sigmoid(image_features @ text_features.T * model.logit_scale.exp() + model.logit_bias)

zipped_list = list(zip(labels_list, [100 * round(p.item(), 3) for p in text_probs[0]]))
print("Label probabilities: ", zipped_list)

open_clip.__version__: '2.31.0'
timm: 1.0.15

Hi! You need to update the transformers library to the latest version: pip install transformers==4.51.0
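To confirm the environment before retrying, a quick version check along these lines may help. This is only a sketch: the minimum versions are the ones mentioned in this thread (open-clip-torch 2.31.0 and timm 1.0.15 from the README comment, transformers 4.51.0 from the reply above), and treating 4.51.0 as a minimum rather than an exact pin is an assumption. The helper names `release_tuple`, `meets_minimum`, and `check_environment` are hypothetical and not part of any library.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions quoted in this thread; assumed minimums, not official requirements.
MINIMUMS = {
    "open-clip-torch": "2.31.0",
    "timm": "1.0.15",
    "transformers": "4.51.0",
}


def release_tuple(version_string):
    """Turn '4.51.0' into (4, 51, 0), ignoring suffixes such as '.dev0' or 'rc1'."""
    parts = []
    for piece in version_string.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
        if digits != piece:  # a suffix like 'rc1' ends the numeric release part
            break
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare numeric release parts, padding the shorter one with zeros."""
    a, b = release_tuple(installed), release_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_environment(minimums=MINIMUMS):
    """Return {package: (installed_version_or_None, is_new_enough)}."""
    report = {}
    for package, minimum in minimums.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None
        ok = installed is not None and meets_minimum(installed, minimum)
        report[package] = (installed, ok)
    return report


if __name__ == "__main__":
    for package, (installed, ok) in check_environment().items():
        status = "ok" if ok else f"needs >= {MINIMUMS[package]}"
        print(f"{package}: {installed or 'not installed'} ({status})")
```

If any package is reported as too old or missing, upgrading it with pip and rerunning the README snippet would be the next thing to try.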

OK, thank you! I have solved this issue.
