---
title: Fashion Search Engine
emoji: πŸ›οΈ
colorFrom: pink
colorTo: purple
sdk: gradio
sdk_version: 4.29.0
app_file: app.py
pinned: false
license: apache-2.0
---

πŸ›οΈ Fashion Search Engine (Image + Text)

This project provides an efficient way to search fashion products: users upload an image or enter a descriptive text query, and the system returns visually or semantically similar fashion items.

Powered by OpenAI’s CLIP ViT-B/32 model and accelerated using ONNX and FAISS for real-time retrieval.


Example UI: Left - Image-based Search, Right - Text-based Search


🧠 Model Details

To accelerate inference, we export both the visual and text encoders to ONNX format. Our benchmark results (test_onnx.py) demonstrate a ~32Γ— speedup using ONNX Runtime compared to the original PyTorch models.

  • Model: ViT-B/32 (OpenAI CLIP)
  • Backends:
    • Image encoder β†’ ONNX
    • Text encoder β†’ ONNX
  • Inference engine: onnxruntime
  • Indexing: FAISS with L2-normalized vectors
  • Benchmark: ~32x speedup (measured on CPU using test_onnx.py)

πŸ› οΈ Installation & Setup

1. Environment Setup

conda create -n product-match python=3.10
conda activate product-match
pip install -r requirements.txt

Make sure MongoDB is running locally at mongodb://localhost:27017 before continuing.


2. πŸ—‚οΈ Dataset Preparation

To experiment with this system, we used the E-commerce Product Images dataset from Kaggle. Run the following scripts to prepare the fashion dataset:

# Download and structure the dataset
python get_dataset.py
# Add product image paths to fashion.csv
python update_csv.py
Example samples from the Kaggle E-commerce Product Images dataset
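
The CSV-augmentation step can be sketched roughly as follows. This is a hypothetical illustration (the real update_csv.py is not shown here), assuming a ProductId column and image files named `<ProductId>.jpg`:

```python
import pandas as pd
from pathlib import Path

# Hypothetical layout: the actual column names in fashion.csv may differ.
IMAGE_DIR = Path("data/images")

df = pd.DataFrame({"ProductId": [101, 102], "ProductTitle": ["Blue Shirt", "Red Dress"]})

# Derive each product's image path from its id and store it in a new column,
# so later scripts can load images directly from the CSV.
df["image_path"] = df["ProductId"].map(lambda pid: str(IMAGE_DIR / f"{pid}.jpg"))
print(df["image_path"].tolist())
```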


3. 🧾 Generate Embeddings

From the app/faiss/ directory:

# Generate CLIP text embeddings from product descriptions
python generate_text_embeddings.py

# Generate CLIP image embeddings from product images
python generate_visual_embeddings.py

These scripts will output .csv embedding files under data/.
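
Since the FAISS setup (step 4) works on L2-normalized vectors, each embedding is scaled to unit length before indexing. A minimal numpy sketch of that normalization, shown here on random stand-in vectors:

```python
import numpy as np

def l2_normalize(embeddings: np.ndarray) -> np.ndarray:
    """Scale each row to unit length so inner product equals cosine similarity."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    return embeddings / np.clip(norms, 1e-12, None)

emb = np.random.rand(4, 512).astype("float32")  # stand-in for CLIP embeddings
emb = l2_normalize(emb)
print(np.linalg.norm(emb, axis=1))  # each row norm is now ~1.0
```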


4. 🧠 Build FAISS Index

Navigate to the app/faiss/ directory and run the following script to build indexes for fast similarity search:

python build_faiss_index.py

This script will generate:

  • faiss_image.index – FAISS index for image embeddings
  • faiss_text.index – FAISS index for text embeddings
  • image_id_to_meta.pkl – metadata mapping for image results
  • text_id_to_meta.pkl – metadata mapping for text results

These files are required for the search engine to return relevant product matches.


5. πŸ—ƒοΈ MongoDB Setup

Set up the MongoDB database for logging inference queries and results:

cd app/db/
python mongo_setup.py

This script will:

  • Connect to mongodb://localhost:27017
  • Create a database named product_matching
  • Initialize a collection called logs

This collection will automatically store:

  • Input query details (text or image)
  • Top matching results with metadata
  • Any runtime errors encountered during inference
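
The exact schema is defined in mongo_setup.py and not reproduced in this README; a hypothetical sketch of what a log document along these lines could look like:

```python
from datetime import datetime, timezone

# Hypothetical log document; the actual field names may differ.
log_entry = {
    "timestamp": datetime.now(timezone.utc),
    "query_type": "text",            # or "image"
    "query": "red summer dress",
    "results": [
        {"product_id": 101, "score": 0.87},
        {"product_id": 205, "score": 0.81},
    ],
    "error": None,                   # populated if inference fails
}

# With MongoDB running, it would be stored roughly like this:
# from pymongo import MongoClient
# MongoClient("mongodb://localhost:27017")["product_matching"]["logs"].insert_one(log_entry)
print(sorted(log_entry))
```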

⚠️ Make sure MongoDB is installed and running locally before executing this step.

Screenshots from the database logs and products.

You can monitor logs using a MongoDB GUI like MongoDB Compass or via shell:

mongo
use product_matching
db.logs.find().pretty()

6. πŸ§ͺ Launch the Gradio Demo UI

After preparing the dataset, embeddings, FAISS indexes, and MongoDB, you can launch the interactive demo:

python app/ui/gradio_search.py

Once the script runs, Gradio will start a local web server and display a URL. You're now ready to explore and experiment with multi-modal product search. 🎯


πŸ“„ References & Licensing

This project was developed as part of Omar Moured's job application for a position at Sereact.ai.

The code, data processing scripts, and UI implementation provided in this repository are not intended for public distribution or reuse.

All content is protected under a custom restricted-use license. You may not copy, distribute, modify, or use any portion of this codebase without explicit written permission from the author.