omoured committed
Commit 6307e13 · 1 Parent(s): 63db507

Initial commit with LFS tracking

.gitattributes CHANGED
@@ -33,3 +33,5 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ *.index filter=lfs diff=lfs merge=lfs -text
+ *.jpg filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -1,14 +1,159 @@
  ---
- title: Fashion Search Engine
- emoji: 🏒
- colorFrom: purple
- colorTo: blue
- sdk: gradio
- sdk_version: 5.39.0
- app_file: app.py
- pinned: false
- license: cc-by-nc-nd-4.0
- short_description: About AI-powered fashion product search engine
  ---

- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # 🛍️ Fashion Search Engine (Image + Text)
+
+ This project provides an efficient way to search fashion products using either an image or a textual description: upload a product image or type a descriptive query, and the system returns visually or semantically similar fashion items.
+
+ Powered by **OpenAI’s CLIP ViT-B/32** model and accelerated using ONNX and FAISS for real-time retrieval.
+
+ ---
+
+ <div align="center">
+ <img src="misc/image-query.png" alt="Image Query Example" width="45%" style="margin-right: 2%;">
+ <img src="misc/text-query.png" alt="Text Query Example" width="45%">
+ </div>
+
+ <p align="center"><em>Example UI: Left - Image-based Search, Right - Text-based Search</em></p>
+
+ ---
+
+ ## 🧠 Model Details
+
+ To accelerate inference, we export both the **visual** and **text** encoders to **ONNX** format. Our benchmark results (`test_onnx.py`) demonstrate a **~32× speedup** using ONNX Runtime compared to the original PyTorch models. A minimal export sketch is included after the list below.
+
+ - **Model:** `ViT-B/32` (OpenAI CLIP)
+ - **Backends:**
+   - Image encoder → ONNX
+   - Text encoder → ONNX
+ - **Inference engine:** `onnxruntime`
+ - **Indexing:** `FAISS` with L2-normalized vectors
+ - **Benchmark:** ~32× speedup (measured on CPU using `test_onnx.py`)
+
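The export scripts themselves are not included in this Space, so the following is only a sketch of how the two encoders could be exported. It assumes the Hugging Face `transformers` CLIP implementation (which the app's tokenizer already relies on); the author's actual export code may differ.

```python
# Hypothetical export sketch: not the author's actual export code.
import torch
from transformers import CLIPModel


class ImageEncoder(torch.nn.Module):
    """Exposes only CLIP's image tower so it can be exported on its own."""
    def __init__(self, clip):
        super().__init__()
        self.clip = clip

    def forward(self, pixel_values):
        return self.clip.get_image_features(pixel_values=pixel_values)


class TextEncoder(torch.nn.Module):
    """Exposes only CLIP's text tower (77-token padded input, as used in app.py)."""
    def __init__(self, clip):
        super().__init__()
        self.clip = clip

    def forward(self, input_ids):
        return self.clip.get_text_features(input_ids=input_ids)


clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32").eval()

# Image encoder -> models/clip_vitb32.onnx
torch.onnx.export(ImageEncoder(clip), torch.randn(1, 3, 224, 224),
                  "models/clip_vitb32.onnx",
                  input_names=["pixel_values"], output_names=["image_embeds"],
                  dynamic_axes={"pixel_values": {0: "batch"}}, opset_version=14)

# Text encoder -> models/clip_text_encoder.onnx
torch.onnx.export(TextEncoder(clip), torch.ones(1, 77, dtype=torch.int64),
                  "models/clip_text_encoder.onnx",
                  input_names=["input_ids"], output_names=["text_embeds"],
                  dynamic_axes={"input_ids": {0: "batch"}}, opset_version=14)
```

The exported graphs are then consumed through `onnxruntime.InferenceSession`, exactly as `app.py` does further down in this commit.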
+ ---
+
+ ## 🛠️ Installation & Setup
+
+ ### 1. Environment Setup
+
+ ```bash
+ conda create -n product-match python=3.10
+ conda activate product-match
+ pip install -r requirements.txt
+ ```
+
+ Make sure MongoDB is running locally at `mongodb://localhost:27017` before continuing.
+
+ ---
+
+ ### 2. 🗂️ Dataset Preparation
+
+ To experiment with this system we used the [E-commerce Product Images](https://www.kaggle.com/datasets/vikashrajluhaniwal/fashion-images) dataset from Kaggle. Run the following scripts to prepare the fashion dataset:
+
+ ```bash
+ # Download and structure the dataset
+ python get_dataset.py
+ # Add each product's image path to fashion.csv
+ python update_csv.py
+ ```
+
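For context, the second script simply augments `fashion.csv` with a resolvable image path per product. A rough sketch of that step, with hypothetical column names and folder layout (the real `update_csv.py` may differ):

```python
# Hypothetical sketch of the update_csv.py step; column names and paths are assumptions.
import os
import pandas as pd

df = pd.read_csv("data/fashion.csv")

# Assume each row has a ProductId matching an image file under data/images/
df["image_path"] = df["ProductId"].astype(str).map(
    lambda pid: os.path.join("data", "images", f"{pid}.jpg")
)

df.to_csv("data/fashion.csv", index=False)
```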
+ <div align="center">
+ <img src="misc/dataset-cover.png" alt="Dataset Cover" width="70%">
+ </div>
+
+ <p align="center"><em>Example samples from the Kaggle E-commerce Product Images dataset</em></p>
+
  ---
+
+ ### 3. 🧾 Generate Embeddings
+
+ From the `app/faiss/` directory:
+
+ ```bash
+ # Generate CLIP text embeddings from product descriptions
+ python generate_text_embeddings.py
+
+ # Generate CLIP image embeddings from product images
+ python generate_visual_embeddings.py
+ ```
+
+ These scripts will output `.csv` embedding files under `data/`.
+
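The embedding scripts are not part of this Space, but conceptually the visual side boils down to running every catalogue image through the ONNX image encoder, L2-normalizing, and writing the vectors out. A minimal sketch under those assumptions (the CSV file names here are illustrative, not the author's):

```python
# Illustrative sketch of generate_visual_embeddings.py; the real script may differ.
import numpy as np
import onnxruntime as ort
import pandas as pd
from PIL import Image
from torchvision import transforms

session = ort.InferenceSession("models/clip_vitb32.onnx")
input_name = session.get_inputs()[0].name

# Same preprocessing as app.py, so query and index embeddings stay comparable
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.5] * 3, [0.5] * 3),
])

df = pd.read_csv("data/fashion.csv")  # expects the image_path column from step 2
rows = []
for _, item in df.iterrows():
    image = Image.open(item["image_path"]).convert("RGB")
    tensor = transform(image).unsqueeze(0).numpy()
    emb = session.run(None, {input_name: tensor})[0][0]
    emb = emb / np.linalg.norm(emb)  # L2-normalize before indexing
    rows.append([item["image_path"], *emb.tolist()])

pd.DataFrame(rows).to_csv("data/image_embeddings.csv", index=False, header=False)
```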
  ---

+ ### 4. 🧠 Build FAISS Index
+
+ Navigate to the `app/faiss/` directory and run the following script to build indexes for fast similarity search:
+
+ ```bash
+ python build_faiss_index.py
+ ```
+
+ This script will generate:
+
+ * `faiss_image.index` – FAISS index for image embeddings
+ * `faiss_text.index` – FAISS index for text embeddings
+ * `image_id_to_meta.pkl` – metadata mapping for image results
+ * `text_id_to_meta.pkl` – metadata mapping for text results
+
+ These files are required for the search engine to return relevant product matches.
+
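Conceptually, building an index amounts to loading the embedding matrix, adding it to a flat FAISS index, and pickling an id-to-metadata map whose order matches the index. A sketch of that idea for the image side; the flat inner-product index and the metadata fields are assumptions (with L2-normalized vectors, inner product is equivalent to cosine similarity):

```python
# Sketch only: the real build_faiss_index.py may choose a different index type.
import pickle
import faiss
import numpy as np
import pandas as pd

emb = pd.read_csv("data/image_embeddings.csv", header=None)
paths = emb.iloc[:, 0].tolist()
vectors = np.ascontiguousarray(emb.iloc[:, 1:].to_numpy(dtype=np.float32))

index = faiss.IndexFlatIP(vectors.shape[1])  # inner product over normalized vectors
index.add(vectors)
faiss.write_index(index, "faiss_image.index")

# app.py looks results up by position (meta_list[idx]), so insertion order
# here must match the order the vectors were added to the index.
meta = {i: {"image_path": p} for i, p in enumerate(paths)}
with open("image_id_to_meta.pkl", "wb") as f:
    pickle.dump(meta, f)
```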
+ ---
+
+ ### 5. 🗃️ MongoDB Setup
+
+ Set up the MongoDB database for logging inference queries and results:
+
+ ```bash
+ cd app/db/
+ python mongo_setup.py
+ ```
+
+ This script will:
+
+ * Connect to `mongodb://localhost:27017`
+ * Create a database named `product_matching`
+ * Initialize a collection called `logs`
+
+ This collection will automatically store:
+
+ * Input query details (text or image)
+ * Top matching results with metadata
+ * Any runtime errors encountered during inference
+
+ ⚠️ Make sure MongoDB is installed and running locally before executing this step.
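`mongo_setup.py` lives in the full repository rather than in this Space, so the following is only a guess at its shape. It assumes `pymongo` and the database/collection names quoted above:

```python
# Hypothetical mongo_setup.py sketch; the actual script may differ.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
db = client["product_matching"]

# MongoDB creates collections lazily, but creating "logs" up front makes the
# setup step explicit and verifiable.
if "logs" not in db.list_collection_names():
    db.create_collection("logs")

print("Collections:", db.list_collection_names())
```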
+
+
+ <div align="center">
+ <img src="misc/db_products.png" alt="Products Collection" width="45%" style="margin-right: 2%;">
+ <img src="misc/db_logs.png" alt="Logs Collection" width="45%">
+ </div>
+
+ <p align="center"><em>Screenshots of the products and logs collections in the database.</em></p>
+
+ You can monitor logs using a MongoDB GUI such as MongoDB Compass, or via the shell (`mongosh` on recent MongoDB versions, `mongo` on older ones):
+
+ ```bash
+ mongo
+ use product_matching
+ db.logs.find().pretty()
+ ```
+
+ ---
+
+ ### 6. 🧪 Launch the Gradio Demo UI
+
+ After preparing the dataset, embeddings, FAISS indexes, and MongoDB, you can launch the interactive demo:
+
+ ```bash
+ python app/ui/gradio_search.py
+ ```
+
+ Once the script runs, Gradio will start a local web server and display a URL. You're now ready to explore and experiment with multi-modal product search. 🎯
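If you need to reach the demo from another machine, Gradio's standard launch options can be used. Assuming the script exposes an interface object the way `app.py` below does, the launch call might become:

```python
# Optional: bind to all interfaces and/or request a temporary public share link.
iface.launch(server_name="0.0.0.0", server_port=7860, share=True)
```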
+
+ ---
+
+ ## 📄 References & Licensing
+
+ This project was developed as part of **Omar Moured's job application** for a position at [Sereact.ai](https://sereact.ai/).
+
+ The code, data processing scripts, and UI implementation provided in this repository are **not intended for public distribution or reuse**.
+
+ All content is protected under a **custom restricted-use license**. You may **not copy, distribute, modify, or use any portion of this codebase** without **explicit written permission** from the author.
app.py ADDED
@@ -0,0 +1,141 @@
+ import os
+ import gradio as gr
+ import onnxruntime as ort
+ import numpy as np
+ import faiss
+ import pickle
+ import tempfile
+ from PIL import Image
+ from torchvision import transforms
+ from transformers import CLIPTokenizer
+
+ # === Config ===
+ TOP_K = 3
+ IMG_ONNX_PATH = "models/clip_vitb32.onnx"
+ TXT_ONNX_PATH = "models/clip_text_encoder.onnx"
+ IMG_INDEX_PATH = "faiss/faiss_image.index"
+ TXT_INDEX_PATH = "faiss/faiss_text.index"
+ IMG_META_PATH = "faiss/image_id_to_meta.pkl"
+ TXT_META_PATH = "faiss/text_id_to_meta.pkl"
+
+ # === Load models and index ===
+ img_session = ort.InferenceSession(IMG_ONNX_PATH)
+ txt_session = ort.InferenceSession(TXT_ONNX_PATH)
+ img_input_name = img_session.get_inputs()[0].name
+ txt_input_name = txt_session.get_inputs()[0].name
+
+ img_index = faiss.read_index(IMG_INDEX_PATH)
+ txt_index = faiss.read_index(TXT_INDEX_PATH)
+
+ with open(IMG_META_PATH, "rb") as f:
+     img_meta = pickle.load(f)
+ img_meta = list(img_meta.items())
+
+ with open(TXT_META_PATH, "rb") as f:
+     txt_meta = pickle.load(f)
+ txt_meta = list(txt_meta.items())
+
+ tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-base-patch32")
+
+ # === Preprocessing ===
+ transform = transforms.Compose([
+     transforms.Resize((224, 224)),
+     transforms.ToTensor(),
+     transforms.Normalize([0.5]*3, [0.5]*3)
+ ])
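+ # Note: this 0.5/0.5 normalization differs from CLIP's original preprocessing
+ # constants; retrieval assumes the indexed embeddings used the same transform.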
+
+ def search(input_img, input_text):
+     top_results = []
+     input_text_clean = input_text.strip() if isinstance(input_text, str) else ""
+     tmp_path = None
+
+     try:
+         real_img = isinstance(input_img, Image.Image)
+         has_text = input_text_clean != ""
+
+         if not real_img and not has_text:
+             return [], "❌ Please upload an image or type a query."
+
+         output_images = []
+         captions = []
+
+         if real_img:
+             image = input_img.convert("RGB")
+             tensor = transform(image).unsqueeze(0).numpy()
+             embedding = img_session.run(None, {img_input_name: tensor})[0]
+             embedding = embedding / np.linalg.norm(embedding, axis=1, keepdims=True)
+
+             scores, indices = img_index.search(embedding.astype(np.float32), TOP_K)
+             meta_list = img_meta
+         else:
+             query = f"Looking for a {input_text_clean}"
+             inputs = tokenizer(query, padding="max_length", max_length=77, return_tensors="np")
+             token_ids = inputs["input_ids"].astype(np.int64)
+             embedding = txt_session.run(None, {txt_input_name: token_ids})[0]
+             embedding = embedding / np.linalg.norm(embedding, axis=1, keepdims=True)
+             scores, indices = txt_index.search(embedding.astype(np.float32), TOP_K)
+             meta_list = txt_meta
+
+         for score, idx in zip(scores[0], indices[0]):
+             if idx == -1:
+                 continue
+             try:
+                 match_id, meta = meta_list[idx]
+             except Exception:
+                 continue
+
+             img_path = meta.get("image_path")
+             if not img_path or not os.path.isfile(img_path):
+                 continue
+
+             image = Image.open(img_path).convert("RGB")
+
+             caption = "\n".join([
+                 f"🆔 ID: {match_id}",
+                 f"🎨 Color: {meta.get('color', 'N/A')}",
+                 f"👗 Product Type: {meta.get('product_type', 'N/A')}",
+                 f"🚻 Gender: {meta.get('gender', 'N/A')}",
+                 f"🛍️ Usage: {meta.get('usage', 'N/A')}",
+                 f"📦 Category: {meta.get('category', 'N/A')}",
+                 f"📈 Score: {score:.3f}"
+             ])
+
+             output_images.append(image)
+             captions.append(caption)
+             top_results.append({
+                 "match_id": match_id,
+                 "score": float(score),
+                 "metadata": meta,
+                 "image_path": img_path
+             })
+
+         if not output_images:
+             return [], "⚠️ No matching results found."
+
+         return output_images, "\n\n".join(captions)
+
+     except Exception as e:
+         return [], f"❌ Error: {str(e)}"
+
+ # === Gradio UI ===
+ iface = gr.Interface(
+     fn=search,
+     inputs=[
+         gr.Image(type="pil", label="Upload Image (optional)", height=224),
+         gr.Textbox(label="Text Query (optional)", placeholder="e.g., red cotton top for girls")
+     ],
+     outputs=[
+         gr.Gallery(label="Top 3 Matches", columns=3, height=300),
+         gr.Textbox(label="Result Details")
+     ],
+     title="🛍️ Find your Fashion with Text or Image",
+     description="Upload a product image or enter a description to find similar fashion items.",
+     examples=[
+         ["examples/2697.jpg", ""],
+         ["examples/3150.jpg", ""],
+         [None, "blue denim jeans"],
+         [None, "white floral dress for summer"]
+     ]
+ )
+
+ iface.launch()
examples/2697.jpg ADDED

Git LFS Details

  • SHA256: 4de74ef9240846c99b52ac26fedac59a685b312fe24e80d79b5ce59a9228a84b
  • Pointer size: 130 Bytes
  • Size of remote file: 16 kB
examples/3150.jpg ADDED

Git LFS Details

  • SHA256: 785731e77deb13b5270481176bac3d4e70999df129674c2ab31fd2687d96648a
  • Pointer size: 131 Bytes
  • Size of remote file: 155 kB
faiss/faiss_image.index ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:35b4996c35317ad8e58c85f1f92065c4e18056d880c1a4b8b96d77d7a2b32944
+ size 5951533
faiss/faiss_text.index ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d0c3a658eafba31071f8161e6c0f7183d684f40016e6e2f8b24e00f41dca9dda
+ size 5951533
faiss/image_id_to_meta.pkl ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6a3941e379637c5ab3e5e1fe2adb3cb793385bd7f41faf9d9bcc2c623f645711
+ size 399652
faiss/text_id_to_meta.pkl ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2a8419d02942d1275ea2c8eb96d5e40e3bdf196abc7d9212f6ee775fae330721
+ size 482793
models/clip_text_encoder.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:846877caaad2fa0a2ad2411c12ba46f01bbc42ca927e3a8e53b3e2c4b678e69f
+ size 254433342
models/clip_vitb32.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a0de506b70897532e280e18e7fd271562f54585b9459a8d9ffd59e26fdeb03c3
+ size 351530149
requirements.txt ADDED
@@ -0,0 +1,7 @@
+ gradio
+ onnxruntime
+ torch
+ torchvision  # imported in app.py for image preprocessing
+ transformers
+ Pillow
+ faiss-cpu