👗 StyleFinder – AI-Powered Fashion Visual Search

StyleFinder is a deep learning-based image retrieval system built on CLIP and fine-tuned on the DeepFashion In-shop Clothes dataset. Users upload an image and retrieve visually similar fashion items, using either the zero-shot (official pretrained) or the fine-tuned CLIP variants.


🧠 Supported Models

| Model    | Stage      | Description                             |
|----------|------------|-----------------------------------------|
| ViT-B/16 | Stage 3 v4 | Best fine-tuned transformer-based model |
| RN50     | Stage 3 v3 | Best fine-tuned CNN-based model         |
| ViT-B/16 | Zero-shot  | Official OpenAI pretrained CLIP         |
| RN50     | Zero-shot  | Official OpenAI pretrained CLIP         |

📊 Evaluation Results

| Metric | ViT-B/16 (v4) | RN50 (v3) |
|--------|---------------|-----------|
| Rank-1 | 46.24%        | 53.95%    |
| mAP    | 0.3481        | 0.4265    |
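
For readers unfamiliar with the two metrics, the following is a minimal sketch of how Rank-1 and mAP are commonly computed for retrieval. It is not the evaluation script behind the numbers above; the function name `rank1_and_map` and the toy item IDs are illustrative assumptions. Both values are returned as fractions in [0, 1], whereas the table reports Rank-1 as a percentage.

```python
def rank1_and_map(ranked_ids, query_ids):
    """Compute Rank-1 accuracy and mean Average Precision.

    ranked_ids[i] is the list of gallery item IDs retrieved for query i,
    ordered from most to least similar. query_ids[i] is the true item ID.
    """
    hits, ap_sum = 0, 0.0
    for ranking, qid in zip(ranked_ids, query_ids):
        # Positions (0-based) in the ranking where a correct item appears.
        relevant = [pos for pos, gid in enumerate(ranking) if gid == qid]
        if not relevant:
            continue  # no correct item retrieved: contributes 0 to both metrics
        hits += int(relevant[0] == 0)  # Rank-1: top result is correct
        # Average of precision@k taken at each relevant position k.
        precisions = [(n + 1) / (pos + 1) for n, pos in enumerate(relevant)]
        ap_sum += sum(precisions) / len(relevant)
    n_queries = len(query_ids)
    return hits / n_queries, ap_sum / n_queries


if __name__ == "__main__":
    # Toy example: two queries, three retrieved gallery items each.
    rankings = [["a", "b", "a"], ["c", "b", "b"]]
    truths = ["a", "b"]
    rank1, mean_ap = rank1_and_map(rankings, truths)
    print(f"Rank-1: {rank1:.2%}  mAP: {mean_ap:.4f}")
```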

πŸ–ΌοΈ Precomputed Gallery Features

Gallery embeddings are stored as .pt files for fast cosine similarity search.

| File Name                   | Description                    |
|-----------------------------|--------------------------------|
| vitb16_stage3_v4_gallery.pt | Fine-tuned ViT-B/16 gallery    |
| rn50_stage3_v3_gallery.pt   | Fine-tuned RN50 gallery        |
| vitb16_zeroshot_gallery.pt  | Official CLIP ViT-B/16 gallery |
| rn50_zeroshot_gallery.pt    | Official CLIP RN50 gallery     |

These are stored in the gallery_features/ directory and can be loaded with load_gallery_features().
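
As an illustration of the cosine similarity search these files enable, here is a minimal sketch. It assumes a gallery is an (N, D) float tensor of embeddings; the actual layout returned by load_gallery_features() is not documented here and may differ (for example, a dict that also holds image paths). The helper name `cosine_top_k` is hypothetical, and a random tensor stands in for a real gallery file.

```python
import torch
import torch.nn.functional as F


def cosine_top_k(query: torch.Tensor, gallery: torch.Tensor, k: int = 5):
    """Return (scores, indices) of the k gallery rows most similar to `query`.

    query:   (D,) or (1, D) embedding
    gallery: (N, D) embeddings (assumed layout)
    """
    q = F.normalize(query.reshape(1, -1), dim=-1)
    g = F.normalize(gallery, dim=-1)
    sims = (q @ g.T).squeeze(0)  # (N,) cosine similarities
    return torch.topk(sims, k=min(k, g.shape[0]))


if __name__ == "__main__":
    torch.manual_seed(0)
    # Stand-in for a loaded gallery; in practice this would come from
    # torch.load(...) on one of the .pt files or load_gallery_features().
    gallery = torch.randn(100, 512)  # embedding size chosen arbitrarily
    query = gallery[42] + 0.01 * torch.randn(512)  # near-duplicate of row 42
    scores, indices = cosine_top_k(query, gallery, k=3)
    print(indices.tolist(), scores.tolist())
```

Normalising both sides once means the search reduces to a single matrix product, which is why precomputing the gallery embeddings keeps query time low.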


βš™οΈ How to Use

🔹 Load a Model

from model_loader import load_model
# arch may also be "rn50"; stage may also be "zeroshot"
model, preprocess = load_model(arch="vitb16", stage="stage3")
