Update README.md
README.md
CHANGED
@@ -28,65 +28,75 @@ tags:
- offline
- privacy
- fast
- android
- ios
- gallery
---

# MobileNetV2 — ONNX, Quantized

### 🔥 Lightweight mobile model for **image classification** into two categories:
- **`document`** (scans, receipts, papers, invoices)
- **`photo`** (regular phone photos: scenes, people, nature, etc.)

---

## 🟢 Overview

- **Designed for mobile devices** (phones and tablets, Android/iOS), perfect for real-time on-device inference!
- Architecture: **MobileNetV2**
- Format: **ONNX** (both float32 and quantized int8 versions included)
- Trained on balanced, real-world open-source datasets for both documents and photos.
- Ideal for tasks like:
  - Document detection in gallery/camera rolls
  - Screenshot, receipt, photo, and PDF preview classification
  - Image sorting for privacy-first offline AI assistants

---

## 🏷️ Model Classes
- **0** — `document`
- **1** — `photo`
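
The indices above are exactly what `np.argmax` returns in the usage example further down. A minimal sketch of mapping that index back to a label, assuming the model emits a single `(1, 2)` score vector per image; whether those scores are raw logits or probabilities is not stated on this card, so the softmax step below is an assumption:

```python
import numpy as np

# Label map taken from the "Model Classes" list above.
ID2LABEL = {0: "document", 1: "photo"}

def decode(scores: np.ndarray) -> tuple[str, float]:
    """Map a (1, 2) score vector to (label, confidence)."""
    logits = scores.reshape(-1)             # -> shape (2,)
    probs = np.exp(logits - logits.max())   # softmax, assuming raw logits
    probs /= probs.sum()
    idx = int(probs.argmax())
    return ID2LABEL[idx], float(probs[idx])

# Example: decode(output[0]) with `output` taken from the usage example below.
```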

---

## ⚡️ Versions

- `mobilenetv2_doc_photo.onnx` — Standard float32 for maximum accuracy (best for ARM/CPU)
- `mobilenetv2_doc_photo_quant.onnx` — Quantized int8 for even faster inference and smaller file size (best for low-power or edge devices)
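
The card does not say how the int8 file was produced. As one hedged possibility, ONNX Runtime's dynamic quantization can derive an int8 model from the float32 file above; only the input file name comes from this repo, the output path and settings are illustrative:

```python
# Hedged sketch: one possible way to produce an int8 model from the float32 one
# using ONNX Runtime's dynamic quantization. This is NOT necessarily how the
# published mobilenetv2_doc_photo_quant.onnx was created.
from onnxruntime.quantization import QuantType, quantize_dynamic

quantize_dynamic(
    model_input="mobilenetv2_doc_photo.onnx",        # float32 model from this repo
    model_output="mobilenetv2_doc_photo_int8.onnx",  # hypothetical output path
    weight_type=QuantType.QInt8,                     # int8 weights
)
```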

---

## 🚀 Why this model?

- **Ultra-small size** (~10-15MB), real-time inference (<100ms) on most phones
- **Runs 100% offline** (privacy, no cloud required)
- **Easy integration** with any framework, including React Native (`onnxruntime-react-native`), Android (ONNX Runtime), and iOS.
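
The latency figure above refers to phones, and real numbers depend entirely on the device. If you want to sanity-check inference time on your own hardware, a minimal timing loop with ONNX Runtime looks roughly like this (the input name `"input"` matches the usage example below; everything else is illustrative):

```python
# Rough latency sanity check on the machine running this script.
# Phone/tablet numbers will differ; this only shows the measurement pattern.
import time

import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("mobilenetv2_doc_photo_quant.onnx")
dummy = np.random.randn(1, 3, 224, 224).astype(np.float32)

session.run(None, {"input": dummy})  # warm-up run
runs = 50
t0 = time.perf_counter()
for _ in range(runs):
    session.run(None, {"input": dummy})
print(f"avg latency: {(time.perf_counter() - t0) / runs * 1000:.1f} ms")
```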

---

## 🗃️ Datasets

- **Photos:** [alfredplpl/Japanese-photos](https://huggingface.co/datasets/alfredplpl/Japanese-photos)
- **Documents:** [3sara/colpali_italian_documents](https://huggingface.co/datasets/3sara/colpali_italian_documents)

---

## 🛠️ Usage Example

```python
import onnxruntime as ort
import numpy as np

session = ort.InferenceSession("mobilenetv2_doc_photo_quant.onnx")
img = np.random.randn(1, 3, 224, 224).astype(np.float32)  # Replace with your image preprocessing!
output = session.run(None, {"input": img})
pred_class = np.argmax(output[0])
print(pred_class)  # 0 = document, 1 = photo
```
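
The random tensor above is only a placeholder. A hedged sketch of real preprocessing with Pillow, assuming 224×224 RGB input in NCHW layout and standard ImageNet mean/std normalization (the card does not state the exact normalization used during training, so verify it against your own pipeline):

```python
# Hedged preprocessing sketch using Pillow. The resize and normalization values
# are assumptions (standard ImageNet stats); confirm them against the training setup.
import numpy as np
import onnxruntime as ort
from PIL import Image

MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def preprocess(path: str) -> np.ndarray:
    img = Image.open(path).convert("RGB").resize((224, 224))
    arr = np.asarray(img, dtype=np.float32) / 255.0  # HWC, values in [0, 1]
    arr = (arr - MEAN) / STD                         # per-channel normalization
    return arr.transpose(2, 0, 1)[None, ...]         # -> NCHW, shape (1, 3, 224, 224)

session = ort.InferenceSession("mobilenetv2_doc_photo_quant.onnx")
scores = session.run(None, {"input": preprocess("example.jpg")})[0]  # placeholder image path
print("document" if int(np.argmax(scores)) == 0 else "photo")
```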

---

## 🤖 Author
@vlad-m-dev
Built for edge-ai/phone/tablet offline image classification: document vs photo
Telegram: https://t.me/dwight_schrute_engineer