---
license: apache-2.0
language:
- en
base_model:
- google/siglip2-so400m-patch14-384
pipeline_tag: image-classification
library_name: transformers
tags:
- fashion
- product
- usage
- Casual
- Ethnic
- Formal
---

# **Fashion-Product-Usage**
> **Fashion-Product-Usage** is a vision model fine-tuned from **google/siglip2-so400m-patch14-384** using the **SiglipForImageClassification** architecture. It classifies fashion product images by their intended usage context.
```py
Classification Report:
              precision    recall  f1-score   support

      Casual     0.8529    0.9716    0.9084     34392
      Ethnic     0.8365    0.7528    0.7925      3208
      Formal     0.7246    0.3006    0.4250      2345
        Home     0.0000    0.0000    0.0000         1
       Party     0.0000    0.0000    0.0000        29
Smart Casual     0.0000    0.0000    0.0000        67
      Sports     0.7157    0.1848    0.2938      4004
      Travel     0.0000    0.0000    0.0000        26

    accuracy                         0.8458     44072
   macro avg     0.3912    0.2762    0.3024     44072
weighted avg     0.8300    0.8458    0.8159     44072
```
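The table above follows scikit-learn's `classification_report` format. As a minimal sketch of how such a report is produced (the toy `y_true`/`y_pred` ids below are placeholders, not the actual evaluation data):

```python
from sklearn.metrics import classification_report

label_names = ["Casual", "Ethnic", "Formal", "Home",
               "Party", "Smart Casual", "Sports", "Travel"]

# Placeholder ids standing in for the real evaluation split
y_true = [0, 0, 1, 2, 6, 0, 1, 2]
y_pred = [0, 0, 1, 2, 0, 0, 1, 6]

print(classification_report(
    y_true, y_pred,
    labels=list(range(len(label_names))),
    target_names=label_names,
    digits=4,
    zero_division=0,  # classes missing from the toy data score 0.0
))
```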
The model predicts one of the following usage categories (the same mapping can also be read from the model config, as shown after this list):
- **0:** Casual
- **1:** Ethnic
- **2:** Formal
- **3:** Home
- **4:** Party
- **5:** Smart Casual
- **6:** Sports
- **7:** Travel
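
The id-to-label mapping is stored in the checkpoint configuration, so it can be read at runtime instead of hard-coded (assuming the uploaded `config.json` carries an `id2label` table, as Transformers classification checkpoints normally do):

```python
from transformers import SiglipForImageClassification

model = SiglipForImageClassification.from_pretrained("prithivMLmods/Fashion-Product-Usage")

# Expected: {0: 'Casual', 1: 'Ethnic', ..., 7: 'Travel'}
print(model.config.id2label)
```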
---
# **Run with Transformers 🤗**
```python
!pip install -q transformers torch pillow gradio
```
```python
import gradio as gr
from transformers import AutoImageProcessor, SiglipForImageClassification
from PIL import Image
import torch

# Load model and processor
model_name = "prithivMLmods/Fashion-Product-Usage"  # Replace with your actual model path
model = SiglipForImageClassification.from_pretrained(model_name)
processor = AutoImageProcessor.from_pretrained(model_name)

# Label mapping
id2label = {
    0: "Casual",
    1: "Ethnic",
    2: "Formal",
    3: "Home",
    4: "Party",
    5: "Smart Casual",
    6: "Sports",
    7: "Travel"
}

def classify_usage(image):
    """Predicts the usage type of a fashion product."""
    image = Image.fromarray(image).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")

    with torch.no_grad():
        outputs = model(**inputs)
        logits = outputs.logits
        probs = torch.nn.functional.softmax(logits, dim=1).squeeze().tolist()

    predictions = {id2label[i]: round(probs[i], 3) for i in range(len(probs))}
    return predictions

# Gradio interface
iface = gr.Interface(
    fn=classify_usage,
    inputs=gr.Image(type="numpy"),
    outputs=gr.Label(label="Usage Prediction Scores"),
    title="Fashion-Product-Usage",
    description="Upload a fashion product image to predict its intended usage (Casual, Formal, Party, etc.)."
)

# Launch the app
if __name__ == "__main__":
    iface.launch()
```
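
The Gradio app is optional; the model can also be queried directly. A minimal single-image sketch (the file path is a placeholder):

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, SiglipForImageClassification

model_name = "prithivMLmods/Fashion-Product-Usage"
model = SiglipForImageClassification.from_pretrained(model_name)
processor = AutoImageProcessor.from_pretrained(model_name)

image = Image.open("product.jpg").convert("RGB")  # placeholder path
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

probs = logits.softmax(dim=-1).squeeze()
top = probs.argmax().item()
print(f"Predicted usage: {model.config.id2label[top]} ({probs[top]:.3f})")
```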
---
# **Intended Use**
This model can be used for:
- **Product tagging in e-commerce catalogs** (see the batch-tagging sketch after this list)
- **Context-aware product recommendations**
- **Fashion search optimization**
- **Data annotation for training recommendation engines**
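
As an illustration of the catalog-tagging use case, a sketch of batch inference over a folder of product images (the directory name and batch size are arbitrary assumptions):

```python
import torch
from pathlib import Path
from PIL import Image
from transformers import AutoImageProcessor, SiglipForImageClassification

model_name = "prithivMLmods/Fashion-Product-Usage"
model = SiglipForImageClassification.from_pretrained(model_name)
processor = AutoImageProcessor.from_pretrained(model_name)
model.eval()

image_dir = Path("catalog_images")  # hypothetical folder of product photos
paths = sorted(image_dir.glob("*.jpg"))
batch_size = 16  # arbitrary

for i in range(0, len(paths), batch_size):
    batch = paths[i : i + batch_size]
    images = [Image.open(p).convert("RGB") for p in batch]
    inputs = processor(images=images, return_tensors="pt")
    with torch.no_grad():
        preds = model(**inputs).logits.argmax(dim=-1)
    for path, cls in zip(batch, preds.tolist()):
        print(f"{path.name}: {model.config.id2label[cls]}")
```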