Qdrant/clip-ViT-B-32-vision

ONNX port of sentence-transformers/clip-ViT-B-32.

This model is intended for image classification and similarity search: it maps each image to an embedding vector that can be compared with the embeddings of other images.

Usage

Here's an example of performing inference using the model with FastEmbed.

from fastembed import ImageEmbedding

images = [
    "./path/to/image1.jpg",
    "./path/to/image2.jpg",
]

model = ImageEmbedding(model_name="Qdrant/clip-ViT-B-32-vision")
embeddings = list(model.embed(images))

# Example output (values truncated):
# [
#   array([-0.1115,  0.0097,  0.0052,  0.0195, ...], dtype=float32),
#   array([-0.1019,  0.0635, -0.0332,  0.0522, ...], dtype=float32)
# ]
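
For similarity search, the embeddings can be compared with cosine similarity. The snippet below is a minimal sketch rather than part of FastEmbed: `cosine_similarity` and `rank_by_similarity` are hypothetical helper names, and the toy 3-dimensional vectors stand in for real embeddings so the example runs without downloading the model. It uses only the standard library and should also accept the arrays returned by `model.embed`, since it only iterates over the values.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(float(x) * float(y) for x, y in zip(a, b))
    norm_a = math.sqrt(sum(float(x) ** 2 for x in a))
    norm_b = math.sqrt(sum(float(y) ** 2 for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, candidates):
    """Return (index, score) pairs sorted from most to least similar."""
    scores = [(i, cosine_similarity(query, c)) for i, c in enumerate(candidates)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for real image embeddings (assumption for illustration).
query = [1.0, 0.0, 0.0]
candidates = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [2.0, 0.0, 0.0],  # same direction as the query
    [1.0, 1.0, 0.0],  # 45 degrees from the query
]

ranking = rank_by_similarity(query, candidates)
for index, score in ranking:
    print(f"candidate {index}: {score:.4f}")
```

With real data, `query` would be one entry of `embeddings` and `candidates` the rest. For large collections, a vector database or a vectorized NumPy implementation would likely be a better fit than this pure-Python loop.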