πŸ” SigLIP Person Search - Open Set

This model is a fine-tuned version of google/siglip-base-patch16-224 for open-set person retrieval based on natural language descriptions. It's built to support image-text similarity in real-world retail and surveillance scenarios.

🧠 Use Case

This model allows you to search for people in crowded environments (like malls or stores) using only a text prompt, for example:

"A man wearing a white t-shirt and carrying a brown shoulder bag"

The model embeds both the text query and candidate person crops into a shared space, so crops can be ranked by similarity to the description.

πŸ’Ύ Training

  • Base: google/siglip-base-patch16-224
  • Loss: Cosine InfoNCE
  • Data: ReID dataset with multimodal attributes (generated via Gemini)
  • Epochs: 10
  • Usage: Retrieval-style search (not classification)
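The cosine InfoNCE loss listed above can be sketched as follows. This is an illustrative reconstruction, not the project's actual training code; the symmetric formulation and the temperature value are assumptions:

```python
import torch
import torch.nn.functional as F

def cosine_infonce(image_embeds, text_embeds, temperature=0.07):
    """Symmetric InfoNCE over cosine similarities.

    Matched image/text pairs sit on the diagonal of the similarity
    matrix; every off-diagonal entry acts as a negative.
    """
    # L2-normalize so dot products equal cosine similarities
    image_embeds = F.normalize(image_embeds, dim=-1)
    text_embeds = F.normalize(text_embeds, dim=-1)

    logits = image_embeds @ text_embeds.T / temperature
    targets = torch.arange(logits.size(0), device=logits.device)

    # Average the image->text and text->image directions
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.T, targets)) / 2
```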

πŸ“ˆ Intended Use

  • Smart surveillance
  • Anonymous retail behavior tracking
  • Human-in-the-loop retrieval
  • Visual search & retrieval systems

πŸ”§ How to Use

```python
from transformers import AutoProcessor, AutoModel
import torch

processor = AutoProcessor.from_pretrained("adonaivera/siglip-person-search-openset")
model = AutoModel.from_pretrained("adonaivera/siglip-person-search-openset")

text = "A man wearing a white t-shirt and carrying a brown shoulder bag"
inputs = processor(text=text, return_tensors="pt")
with torch.no_grad():
    text_features = model.get_text_features(**inputs)
```
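Ranking candidate person crops against the text embedding can then be sketched like this. The helper below is illustrative; `image_features` would come from `model.get_image_features(**processor(images=crops, return_tensors="pt"))`, where `crops` is a hypothetical list of PIL person crops:

```python
import torch
import torch.nn.functional as F

def rank_by_similarity(text_features, image_features, top_k=5):
    """Return (scores, indices) of the crops most similar to the text.

    text_features:  shape (1, dim), from model.get_text_features
    image_features: shape (num_crops, dim), from model.get_image_features
    """
    # Normalize so the dot product is cosine similarity
    text_features = F.normalize(text_features, dim=-1)
    image_features = F.normalize(image_features, dim=-1)

    sims = (image_features @ text_features.T).squeeze(-1)
    k = min(top_k, sims.size(0))
    return sims.topk(k)
```

The indices returned map back into your list of crops, highest similarity first.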

πŸ“Œ Notes

  • This model is optimized for feature extraction and cosine similarity matching
  • It's not meant for classification or image generation
  • Similarity threshold tuning is required depending on your application