v0.29.1
See https://github.com/quic/ai-hub-models/releases/v0.29.1 for changelog.
- README.md +55 -55
- TextEncoder.bin +0 -3
- TextEncoderQuantizable.bin +0 -3
- TextEncoderQuantizable_w8a16.bin +0 -3
- TextEncoderQuantizable_w8a16.onnx.zip +0 -3
- TextEncoder_Quantized.bin +0 -3
- UNet_Quantized.bin +0 -3
- Unet.bin +0 -3
- UnetQuantizable.bin +0 -3
- UnetQuantizable_w8a16.bin +0 -3
- VAEDecoder_Quantized.bin +0 -3
- VaeDecoder.bin +0 -3
- VaeDecoderQuantizable.bin +0 -3
- VaeDecoderQuantizable.so +0 -3
- VaeDecoderQuantizable_w8a16.bin +0 -3
README.md
CHANGED
@@ -8,7 +8,7 @@ pipeline_tag: unconditional-image-generation

 ---

-
 | Peak Memory Range (MB) | Primary Compute Unit | Target Model
 |---|---|---|---|---|---|---|---|---|
-| TextEncoderQuantizable | w8a16 | QCS8275 (Proxy) | Qualcomm® QCS8275 (Proxy) | QNN | 9.
-| TextEncoderQuantizable | w8a16 | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | QNN | 4.
-| TextEncoderQuantizable | w8a16 | QCS9075 (Proxy) | Qualcomm® QCS9075 (Proxy) | QNN | 4.
-| TextEncoderQuantizable | w8a16 | SA7255P ADP | Qualcomm® SA7255P | QNN | 9.
-| TextEncoderQuantizable | w8a16 | SA8255 (Proxy) | Qualcomm® SA8255P (Proxy) | QNN | 4.
-| TextEncoderQuantizable | w8a16 | SA8650 (Proxy) | Qualcomm® SA8650P (Proxy) | QNN | 4.
-| TextEncoderQuantizable | w8a16 | SA8775P ADP | Qualcomm® SA8775P | QNN | 4.
-| TextEncoderQuantizable | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | QNN | 4.
-| TextEncoderQuantizable | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | ONNX | 4.
-| TextEncoderQuantizable | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN | 3.
-| TextEncoderQuantizable | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | ONNX | 3.
-| TextEncoderQuantizable | w8a16 | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | QNN | 3.
-| TextEncoderQuantizable | w8a16 | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | ONNX | 3.
-| TextEncoderQuantizable | w8a16 | Snapdragon X Elite CRD | Snapdragon® X Elite | QNN | 4.
-| TextEncoderQuantizable | w8a16 | Snapdragon X Elite CRD | Snapdragon® X Elite | ONNX | 4.
-| UnetQuantizable | w8a16 | QCS8275 (Proxy) | Qualcomm® QCS8275 (Proxy) | QNN | 269.
-| UnetQuantizable | w8a16 | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | QNN | 114.
-| UnetQuantizable | w8a16 | QCS9075 (Proxy) | Qualcomm® QCS9075 (Proxy) | QNN | 108.
-| UnetQuantizable | w8a16 | SA7255P ADP | Qualcomm® SA7255P | QNN | 269.
-| UnetQuantizable | w8a16 | SA8255 (Proxy) | Qualcomm® SA8255P (Proxy) | QNN | 114.
-| UnetQuantizable | w8a16 | SA8650 (Proxy) | Qualcomm® SA8650P (Proxy) | QNN |
-| UnetQuantizable | w8a16 | SA8775P ADP | Qualcomm® SA8775P | QNN | 108.
-| UnetQuantizable | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | QNN |
-| UnetQuantizable | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | ONNX |
-| UnetQuantizable | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN |
-| UnetQuantizable | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | ONNX |
-| UnetQuantizable | w8a16 | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | QNN | 71.
-| UnetQuantizable | w8a16 | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | ONNX |
-| UnetQuantizable | w8a16 | Snapdragon X Elite CRD | Snapdragon® X Elite | QNN | 116.
-| UnetQuantizable | w8a16 | Snapdragon X Elite CRD | Snapdragon® X Elite | ONNX |
-| VaeDecoderQuantizable | w8a16 | QCS8275 (Proxy) | Qualcomm® QCS8275 (Proxy) | QNN | 720.
-| VaeDecoderQuantizable | w8a16 | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | QNN |
-| VaeDecoderQuantizable | w8a16 | QCS9075 (Proxy) | Qualcomm® QCS9075 (Proxy) | QNN | 250.
-| VaeDecoderQuantizable | w8a16 | SA7255P ADP | Qualcomm® SA7255P | QNN | 720.
-| VaeDecoderQuantizable | w8a16 | SA8255 (Proxy) | Qualcomm® SA8255P (Proxy) | QNN |
-| VaeDecoderQuantizable | w8a16 | SA8650 (Proxy) | Qualcomm® SA8650P (Proxy) | QNN |
-| VaeDecoderQuantizable | w8a16 | SA8775P ADP | Qualcomm® SA8775P | QNN | 250.
-| VaeDecoderQuantizable | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | QNN |
-| VaeDecoderQuantizable | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | ONNX |
-| VaeDecoderQuantizable | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN |
-| VaeDecoderQuantizable | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | ONNX |
-| VaeDecoderQuantizable | w8a16 | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | QNN |
-| VaeDecoderQuantizable | w8a16 | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | ONNX |
-| VaeDecoderQuantizable | w8a16 | Snapdragon X Elite CRD | Snapdragon® X Elite | QNN | 266.
-| VaeDecoderQuantizable | w8a16 | Snapdragon X Elite CRD | Snapdragon® X Elite | ONNX |

@@ -91,7 +91,7 @@ More details on model performance across various devices, can be found

 Install the package via pip:
 ```bash
-pip install "qai-hub-models[stable-diffusion-v1-5
 ```

@@ -115,7 +115,7 @@ The package contains a simple end-to-end demo that downloads pre-trained
 weights and runs this model on a sample input.

 ```bash
-python -m qai_hub_models.models.
 ```

 The above demo runs a reference implementation of pre-processing, model
@@ -124,7 +124,7 @@ inference, and post processing.
 **NOTE**: If you want running in a Jupyter Notebook or Google Colab like
 environment, please add the following to your cell (instead of the above).
 ```
-%run -m qai_hub_models.models.
 ```

@@ -137,7 +137,7 @@ device. This script does the following:
 * Accuracy check between PyTorch and on-device outputs.

 ```bash
-python -m qai_hub_models.models.
 ```
 ```
 Profiling Results
@@ -145,7 +145,7 @@ Profiling Results
 TextEncoderQuantizable
 Device : cs_8275 (ANDROID 14)
 Runtime : QNN
-Estimated inference time (ms) : 9.
 Estimated peak memory usage (MB): [0, 9]
 Total # Ops : 533
 Compute Unit(s) : npu (533 ops) gpu (0 ops) cpu (0 ops)
@@ -154,7 +154,7 @@ Compute Unit(s) : npu (533 ops) gpu (0 ops) cpu (0 ops)
 UnetQuantizable
 Device : cs_8275 (ANDROID 14)
 Runtime : QNN
-Estimated inference time (ms) : 269.
 Estimated peak memory usage (MB): [0, 8]
 Total # Ops : 4041
 Compute Unit(s) : npu (4041 ops) gpu (0 ops) cpu (0 ops)
@@ -164,7 +164,7 @@ VaeDecoderQuantizable
 Device : cs_8275 (ANDROID 14)
 Runtime : QNN
 Estimated inference time (ms) : 720.6
-Estimated peak memory usage (MB): [0,
 Total # Ops : 189
 Compute Unit(s) : npu (189 ops) gpu (0 ops) cpu (0 ops)
 ```
@@ -188,7 +188,7 @@ provides instructions on how to use the `.so` shared library in an Android appl


 ## View on Qualcomm® AI Hub
-Get more details on Stable-Diffusion-v1.5's performance across various devices [here](https://aihub.qualcomm.com/models/
 Explore all available models on [Qualcomm® AI Hub](https://aihub.qualcomm.com/)

@@ -8,7 +8,7 @@ pipeline_tag: unconditional-image-generation

 ---

+

 # Stable-Diffusion-v1.5: Optimized for Mobile Deployment
 ## State-of-the-art generative AI model used to generate detailed images conditioned on text descriptions
@@ -21,7 +21,7 @@

 This repository provides scripts to run Stable-Diffusion-v1.5 on Qualcomm® devices.
 More details on model performance across various devices, can be found
+[here](https://aihub.qualcomm.com/models/stable_diffusion_v1_5).


 ### Model Details
@@ -37,51 +37,51 @@

 | Model | Precision | Device | Chipset | Target Runtime | Inference Time (ms) | Peak Memory Range (MB) | Primary Compute Unit | Target Model
 |---|---|---|---|---|---|---|---|---|
+| TextEncoderQuantizable | w8a16 | QCS8275 (Proxy) | Qualcomm® QCS8275 (Proxy) | QNN | 9.359 ms | 0 - 9 MB | NPU | Use Export Script |
+| TextEncoderQuantizable | w8a16 | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | QNN | 4.49 ms | 0 - 3 MB | NPU | Use Export Script |
+| TextEncoderQuantizable | w8a16 | QCS9075 (Proxy) | Qualcomm® QCS9075 (Proxy) | QNN | 4.954 ms | 0 - 10 MB | NPU | Use Export Script |
+| TextEncoderQuantizable | w8a16 | SA7255P ADP | Qualcomm® SA7255P | QNN | 9.359 ms | 0 - 9 MB | NPU | Use Export Script |
+| TextEncoderQuantizable | w8a16 | SA8255 (Proxy) | Qualcomm® SA8255P (Proxy) | QNN | 4.541 ms | 0 - 2 MB | NPU | Use Export Script |
+| TextEncoderQuantizable | w8a16 | SA8650 (Proxy) | Qualcomm® SA8650P (Proxy) | QNN | 4.619 ms | 0 - 2 MB | NPU | Use Export Script |
+| TextEncoderQuantizable | w8a16 | SA8775P ADP | Qualcomm® SA8775P | QNN | 4.954 ms | 0 - 10 MB | NPU | Use Export Script |
+| TextEncoderQuantizable | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | QNN | 4.56 ms | 0 - 10 MB | NPU | Use Export Script |
+| TextEncoderQuantizable | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | ONNX | 4.728 ms | 0 - 164 MB | NPU | [Stable-Diffusion-v1.5.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v1.5/blob/main/Stable-Diffusion-v1.5_w8a16.onnx) |
+| TextEncoderQuantizable | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN | 3.271 ms | 0 - 18 MB | NPU | Use Export Script |
+| TextEncoderQuantizable | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | ONNX | 3.346 ms | 0 - 14 MB | NPU | [Stable-Diffusion-v1.5.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v1.5/blob/main/Stable-Diffusion-v1.5_w8a16.onnx) |
+| TextEncoderQuantizable | w8a16 | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | QNN | 3.046 ms | 0 - 14 MB | NPU | Use Export Script |
+| TextEncoderQuantizable | w8a16 | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | ONNX | 3.189 ms | 0 - 14 MB | NPU | [Stable-Diffusion-v1.5.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v1.5/blob/main/Stable-Diffusion-v1.5_w8a16.onnx) |
+| TextEncoderQuantizable | w8a16 | Snapdragon X Elite CRD | Snapdragon® X Elite | QNN | 4.891 ms | 1 - 1 MB | NPU | Use Export Script |
+| TextEncoderQuantizable | w8a16 | Snapdragon X Elite CRD | Snapdragon® X Elite | ONNX | 4.915 ms | 157 - 157 MB | NPU | [Stable-Diffusion-v1.5.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v1.5/blob/main/Stable-Diffusion-v1.5_w8a16.onnx) |
+| UnetQuantizable | w8a16 | QCS8275 (Proxy) | Qualcomm® QCS8275 (Proxy) | QNN | 269.398 ms | 0 - 8 MB | NPU | Use Export Script |
+| UnetQuantizable | w8a16 | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | QNN | 114.529 ms | 0 - 2 MB | NPU | Use Export Script |
+| UnetQuantizable | w8a16 | QCS9075 (Proxy) | Qualcomm® QCS9075 (Proxy) | QNN | 108.714 ms | 0 - 8 MB | NPU | Use Export Script |
+| UnetQuantizable | w8a16 | SA7255P ADP | Qualcomm® SA7255P | QNN | 269.398 ms | 0 - 8 MB | NPU | Use Export Script |
+| UnetQuantizable | w8a16 | SA8255 (Proxy) | Qualcomm® SA8255P (Proxy) | QNN | 114.507 ms | 1 - 3 MB | NPU | Use Export Script |
+| UnetQuantizable | w8a16 | SA8650 (Proxy) | Qualcomm® SA8650P (Proxy) | QNN | 113.487 ms | 0 - 2 MB | NPU | Use Export Script |
+| UnetQuantizable | w8a16 | SA8775P ADP | Qualcomm® SA8775P | QNN | 108.714 ms | 0 - 8 MB | NPU | Use Export Script |
+| UnetQuantizable | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | QNN | 114.344 ms | 0 - 2 MB | NPU | Use Export Script |
+| UnetQuantizable | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | ONNX | 112.155 ms | 0 - 4 MB | NPU | [Stable-Diffusion-v1.5.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v1.5/blob/main/Stable-Diffusion-v1.5_w8a16.onnx) |
+| UnetQuantizable | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN | 81.714 ms | 0 - 19 MB | NPU | Use Export Script |
+| UnetQuantizable | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | ONNX | 79.459 ms | 0 - 14 MB | NPU | [Stable-Diffusion-v1.5.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v1.5/blob/main/Stable-Diffusion-v1.5_w8a16.onnx) |
+| UnetQuantizable | w8a16 | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | QNN | 71.239 ms | 0 - 14 MB | NPU | Use Export Script |
+| UnetQuantizable | w8a16 | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | ONNX | 71.488 ms | 0 - 15 MB | NPU | [Stable-Diffusion-v1.5.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v1.5/blob/main/Stable-Diffusion-v1.5_w8a16.onnx) |
+| UnetQuantizable | w8a16 | Snapdragon X Elite CRD | Snapdragon® X Elite | QNN | 116.593 ms | 0 - 0 MB | NPU | Use Export Script |
+| UnetQuantizable | w8a16 | Snapdragon X Elite CRD | Snapdragon® X Elite | ONNX | 114.443 ms | 842 - 842 MB | NPU | [Stable-Diffusion-v1.5.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v1.5/blob/main/Stable-Diffusion-v1.5_w8a16.onnx) |
+| VaeDecoderQuantizable | w8a16 | QCS8275 (Proxy) | Qualcomm® QCS8275 (Proxy) | QNN | 720.65 ms | 0 - 9 MB | NPU | Use Export Script |
+| VaeDecoderQuantizable | w8a16 | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | QNN | 268.706 ms | 0 - 3 MB | NPU | Use Export Script |
+| VaeDecoderQuantizable | w8a16 | QCS9075 (Proxy) | Qualcomm® QCS9075 (Proxy) | QNN | 250.387 ms | 0 - 12 MB | NPU | Use Export Script |
+| VaeDecoderQuantizable | w8a16 | SA7255P ADP | Qualcomm® SA7255P | QNN | 720.65 ms | 0 - 9 MB | NPU | Use Export Script |
+| VaeDecoderQuantizable | w8a16 | SA8255 (Proxy) | Qualcomm® SA8255P (Proxy) | QNN | 273.815 ms | 0 - 2 MB | NPU | Use Export Script |
+| VaeDecoderQuantizable | w8a16 | SA8650 (Proxy) | Qualcomm® SA8650P (Proxy) | QNN | 274.195 ms | 0 - 2 MB | NPU | Use Export Script |
+| VaeDecoderQuantizable | w8a16 | SA8775P ADP | Qualcomm® SA8775P | QNN | 250.387 ms | 0 - 12 MB | NPU | Use Export Script |
+| VaeDecoderQuantizable | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | QNN | 270.703 ms | 0 - 3 MB | NPU | Use Export Script |
+| VaeDecoderQuantizable | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | ONNX | 268.632 ms | 0 - 66 MB | NPU | [Stable-Diffusion-v1.5.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v1.5/blob/main/Stable-Diffusion-v1.5_w8a16.onnx) |
+| VaeDecoderQuantizable | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN | 205.905 ms | 0 - 21 MB | NPU | Use Export Script |
+| VaeDecoderQuantizable | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | ONNX | 206.342 ms | 3 - 23 MB | NPU | [Stable-Diffusion-v1.5.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v1.5/blob/main/Stable-Diffusion-v1.5_w8a16.onnx) |
+| VaeDecoderQuantizable | w8a16 | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | QNN | 192.889 ms | 0 - 15 MB | NPU | Use Export Script |
+| VaeDecoderQuantizable | w8a16 | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | ONNX | 175.944 ms | 3 - 17 MB | NPU | [Stable-Diffusion-v1.5.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v1.5/blob/main/Stable-Diffusion-v1.5_w8a16.onnx) |
+| VaeDecoderQuantizable | w8a16 | Snapdragon X Elite CRD | Snapdragon® X Elite | QNN | 266.828 ms | 0 - 0 MB | NPU | Use Export Script |
+| VaeDecoderQuantizable | w8a16 | Snapdragon X Elite CRD | Snapdragon® X Elite | ONNX | 264.883 ms | 63 - 63 MB | NPU | [Stable-Diffusion-v1.5.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v1.5/blob/main/Stable-Diffusion-v1.5_w8a16.onnx) |

@@ -91,7 +91,7 @@ More details on model performance across various devices, can be found

 Install the package via pip:
 ```bash
+pip install "qai-hub-models[stable-diffusion-v1-5]"
 ```

@@ -115,7 +115,7 @@ The package contains a simple end-to-end demo that downloads pre-trained
 weights and runs this model on a sample input.

 ```bash
+python -m qai_hub_models.models.stable_diffusion_v1_5.demo
 ```

 The above demo runs a reference implementation of pre-processing, model
@@ -124,7 +124,7 @@ inference, and post processing.
 **NOTE**: If you want running in a Jupyter Notebook or Google Colab like
 environment, please add the following to your cell (instead of the above).
 ```
+%run -m qai_hub_models.models.stable_diffusion_v1_5.demo
 ```

@@ -137,7 +137,7 @@ device. This script does the following:
 * Accuracy check between PyTorch and on-device outputs.

 ```bash
+python -m qai_hub_models.models.stable_diffusion_v1_5.export
 ```
 ```
 Profiling Results
@@ -145,7 +145,7 @@ Profiling Results
 TextEncoderQuantizable
 Device : cs_8275 (ANDROID 14)
 Runtime : QNN
+Estimated inference time (ms) : 9.4
 Estimated peak memory usage (MB): [0, 9]
 Total # Ops : 533
 Compute Unit(s) : npu (533 ops) gpu (0 ops) cpu (0 ops)
@@ -154,7 +154,7 @@ Compute Unit(s) : npu (533 ops) gpu (0 ops) cpu (0 ops)
 UnetQuantizable
 Device : cs_8275 (ANDROID 14)
 Runtime : QNN
+Estimated inference time (ms) : 269.4
 Estimated peak memory usage (MB): [0, 8]
 Total # Ops : 4041
 Compute Unit(s) : npu (4041 ops) gpu (0 ops) cpu (0 ops)
@@ -164,7 +164,7 @@ VaeDecoderQuantizable
 Device : cs_8275 (ANDROID 14)
 Runtime : QNN
 Estimated inference time (ms) : 720.6
+Estimated peak memory usage (MB): [0, 9]
 Total # Ops : 189
 Compute Unit(s) : npu (189 ops) gpu (0 ops) cpu (0 ops)
 ```
@@ -188,7 +188,7 @@ provides instructions on how to use the `.so` shared library in an Android appl


 ## View on Qualcomm® AI Hub
+Get more details on Stable-Diffusion-v1.5's performance across various devices [here](https://aihub.qualcomm.com/models/stable_diffusion_v1_5).
 Explore all available models on [Qualcomm® AI Hub](https://aihub.qualcomm.com/)

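The three components in the README table run at different frequencies in a Stable Diffusion pipeline: the text encoder and the VAE decoder run once per image, while the UNet runs once per denoising step. The sketch below shows how the QCS8275 (Proxy) QNN figures from the table might combine into a rough per-image estimate. The step count of 20 and the single UNet call per step are assumptions chosen for illustration (classifier-free guidance typically doubles the UNet calls), `estimate_image_latency_ms` is a hypothetical helper, and scheduler, tokenizer, and data-transfer overhead are ignored.

```python
# Inference times (ms) copied from the QCS8275 (Proxy) / QNN rows of the table.
LATENCY_MS = {
    "text_encoder": 9.359,
    "unet": 269.398,
    "vae_decoder": 720.65,
}


def estimate_image_latency_ms(latency_ms, num_steps, unet_calls_per_step=1):
    """Rough model-only latency for one image.

    num_steps and unet_calls_per_step are assumptions, not values
    taken from the README or the commit.
    """
    unet_total = num_steps * unet_calls_per_step * latency_ms["unet"]
    return latency_ms["text_encoder"] + unet_total + latency_ms["vae_decoder"]


# Assumed 20 denoising steps with one UNet call per step.
estimate = estimate_image_latency_ms(LATENCY_MS, num_steps=20)
print(f"Estimated model time per image: {estimate / 1000:.2f} s")
```

Because the UNet term is multiplied by the step count, it dominates the estimate on every device in the table, so the UNet row is the most useful one to compare across chipsets.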
TextEncoder.bin
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:5ea609803056cc46b35aaf7db04e7091a2cdeee823e64bbd569faf594b7e6e8b
-size 163545088
TextEncoderQuantizable.bin
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:9ed3b67ad0b0725f72b42427afef780be75fdfd138b874cb891e2af34dcbac8e
-size 163545088
TextEncoderQuantizable_w8a16.bin
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:d311113834583b852501aee90ffbb25a35f128fc43fb712600d65c674f974040
-size 163548336
TextEncoderQuantizable_w8a16.onnx.zip
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:3cf1a0900cd118efd4d11f5067ba79a8f99d8995b5c11947d6b5228154086857
-size 127241529
TextEncoder_Quantized.bin
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:aad7cc2d5c4ae1ceb59264d47880c109bdc963aa1d0841d47dfcd34032556abe
-size 163275152
UNet_Quantized.bin
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:e7523141556997cc2e6b4a1bacc0dc59b38b05fd18aae8c64004987d05f0eb7e
-size 878473240
Unet.bin
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:a8057f09a165388abfdbfc1520983ff368bf58dd5abf0fd29affafbee68e3e1b
-size 879088632
UnetQuantizable.bin
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:a8acc84d9be477334dc746e4d9c7ac94ec82a0aa538fb186a1452f4a377f2bec
-size 879088632
UnetQuantizable_w8a16.bin
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:3b0f06fd2f9fb9d3ec1e5a0b5d46242eab182a187a70f45b6639023338ac2e1e
-size 881209680
VAEDecoder_Quantized.bin
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:7789b6a8b8aa6ae02f20f2817b54b410c45f0fddee9cf231cf3aac83724f8975
-size 59072424
VaeDecoder.bin
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:9a2d7e70ba95a2749d73f9785233d25ebf5abb1e34351d87c0f1c9e0adb00d49
-size 64693320
VaeDecoderQuantizable.bin
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:8b613581e3a9ff71c1637918dce76ba296d55abc8807f0faef12556ed60525d3
-size 64693320
VaeDecoderQuantizable.so
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:a89f37abd5657cf80a35936d75058bfb964fc36920eaa0d66bc9b2fe37822d83
-size 50386176
VaeDecoderQuantizable_w8a16.bin
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:c818562619cfc7622ab57fb139afc9033df650f4049d8d2b9443210e5a7b7846
-size 64701512
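Each deleted `.bin`, `.so`, and `.zip` entry in this commit is a Git LFS pointer file (three `key value` lines: `version`, `oid`, `size`) rather than the binary itself. The sketch below reads one such pointer to recover the hash algorithm, digest, and payload size. It assumes only the three-line layout shown in the hunks above, and `parse_lfs_pointer` is a hypothetical helper name, not part of Git LFS tooling.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        key, _, value = line.partition(" ")
        fields[key] = value

    # The oid field has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


# Pointer contents copied from the deleted TextEncoder.bin entry.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:5ea609803056cc46b35aaf7db04e7091a2cdeee823e64bbd569faf594b7e6e8b
size 163545088
"""

info = parse_lfs_pointer(pointer)
print(f"{info['hash_algo']} digest, {info['size_bytes'] / 2**20:.1f} MiB payload")
```

Summing `size_bytes` over all fourteen pointers would give the total LFS payload this commit removes from the repository.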