Commit 1d0c72d (verified), committed by qaihm-bot · 1 Parent(s): d5882b5

See https://github.com/quic/ai-hub-models/releases/v0.29.1 for changelog.

README.md CHANGED
@@ -8,7 +8,7 @@ pipeline_tag: unconditional-image-generation
8
 
9
  ---
10
 
11
- ![](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/stable_diffusion_v1_5_w8a16_quantized/web-assets/model_demo.png)
12
 
13
  # Stable-Diffusion-v1.5: Optimized for Mobile Deployment
14
  ## State-of-the-art generative AI model used to generate detailed images conditioned on text descriptions
@@ -21,7 +21,7 @@ This model is an implementation of Stable-Diffusion-v1.5 found [here](https://gi
21
 
22
  This repository provides scripts to run Stable-Diffusion-v1.5 on Qualcomm® devices.
23
  More details on model performance across various devices can be found
24
- [here](https://aihub.qualcomm.com/models/stable_diffusion_v1_5_w8a16_quantized).
25
 
26
 
27
  ### Model Details
@@ -37,51 +37,51 @@ More details on model performance across various devices, can be found
37
 
38
  | Model | Precision | Device | Chipset | Target Runtime | Inference Time (ms) | Peak Memory Range (MB) | Primary Compute Unit | Target Model |
39
  |---|---|---|---|---|---|---|---|---|
40
- | TextEncoderQuantizable | w8a16 | QCS8275 (Proxy) | Qualcomm® QCS8275 (Proxy) | QNN | 9.306 ms | 0 - 9 MB | NPU | Use Export Script |
41
- | TextEncoderQuantizable | w8a16 | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | QNN | 4.543 ms | 0 - 3 MB | NPU | Use Export Script |
42
- | TextEncoderQuantizable | w8a16 | QCS9075 (Proxy) | Qualcomm® QCS9075 (Proxy) | QNN | 4.915 ms | 0 - 10 MB | NPU | Use Export Script |
43
- | TextEncoderQuantizable | w8a16 | SA7255P ADP | Qualcomm® SA7255P | QNN | 9.306 ms | 0 - 9 MB | NPU | Use Export Script |
44
- | TextEncoderQuantizable | w8a16 | SA8255 (Proxy) | Qualcomm® SA8255P (Proxy) | QNN | 4.542 ms | 0 - 2 MB | NPU | Use Export Script |
45
- | TextEncoderQuantizable | w8a16 | SA8650 (Proxy) | Qualcomm® SA8650P (Proxy) | QNN | 4.554 ms | 0 - 2 MB | NPU | Use Export Script |
46
- | TextEncoderQuantizable | w8a16 | SA8775P ADP | Qualcomm® SA8775P | QNN | 4.915 ms | 0 - 10 MB | NPU | Use Export Script |
47
- | TextEncoderQuantizable | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | QNN | 4.528 ms | 0 - 2 MB | NPU | Use Export Script |
48
- | TextEncoderQuantizable | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | ONNX | 4.711 ms | 0 - 163 MB | NPU | [Stable-Diffusion-v1.5.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v1.5/blob/main/Stable-Diffusion-v1.5_w8a16.onnx) |
49
- | TextEncoderQuantizable | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN | 3.254 ms | 0 - 20 MB | NPU | Use Export Script |
50
- | TextEncoderQuantizable | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | ONNX | 3.245 ms | 0 - 18 MB | NPU | [Stable-Diffusion-v1.5.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v1.5/blob/main/Stable-Diffusion-v1.5_w8a16.onnx) |
51
- | TextEncoderQuantizable | w8a16 | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | QNN | 3.053 ms | 0 - 14 MB | NPU | Use Export Script |
52
- | TextEncoderQuantizable | w8a16 | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | ONNX | 3.092 ms | 0 - 18 MB | NPU | [Stable-Diffusion-v1.5.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v1.5/blob/main/Stable-Diffusion-v1.5_w8a16.onnx) |
53
- | TextEncoderQuantizable | w8a16 | Snapdragon X Elite CRD | Snapdragon® X Elite | QNN | 4.912 ms | 0 - 0 MB | NPU | Use Export Script |
54
- | TextEncoderQuantizable | w8a16 | Snapdragon X Elite CRD | Snapdragon® X Elite | ONNX | 4.805 ms | 157 - 157 MB | NPU | [Stable-Diffusion-v1.5.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v1.5/blob/main/Stable-Diffusion-v1.5_w8a16.onnx) |
55
- | UnetQuantizable | w8a16 | QCS8275 (Proxy) | Qualcomm® QCS8275 (Proxy) | QNN | 269.524 ms | 0 - 8 MB | NPU | Use Export Script |
56
- | UnetQuantizable | w8a16 | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | QNN | 114.547 ms | 0 - 2 MB | NPU | Use Export Script |
57
- | UnetQuantizable | w8a16 | QCS9075 (Proxy) | Qualcomm® QCS9075 (Proxy) | QNN | 108.655 ms | 0 - 8 MB | NPU | Use Export Script |
58
- | UnetQuantizable | w8a16 | SA7255P ADP | Qualcomm® SA7255P | QNN | 269.524 ms | 0 - 8 MB | NPU | Use Export Script |
59
- | UnetQuantizable | w8a16 | SA8255 (Proxy) | Qualcomm® SA8255P (Proxy) | QNN | 114.686 ms | 0 - 2 MB | NPU | Use Export Script |
60
- | UnetQuantizable | w8a16 | SA8650 (Proxy) | Qualcomm® SA8650P (Proxy) | QNN | 114.329 ms | 0 - 7 MB | NPU | Use Export Script |
61
- | UnetQuantizable | w8a16 | SA8775P ADP | Qualcomm® SA8775P | QNN | 108.655 ms | 0 - 8 MB | NPU | Use Export Script |
62
- | UnetQuantizable | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | QNN | 113.346 ms | 0 - 2 MB | NPU | Use Export Script |
63
- | UnetQuantizable | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | ONNX | 116.134 ms | 0 - 899 MB | NPU | [Stable-Diffusion-v1.5.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v1.5/blob/main/Stable-Diffusion-v1.5_w8a16.onnx) |
64
- | UnetQuantizable | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN | 80.228 ms | 0 - 18 MB | NPU | Use Export Script |
65
- | UnetQuantizable | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | ONNX | 81.938 ms | 0 - 14 MB | NPU | [Stable-Diffusion-v1.5.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v1.5/blob/main/Stable-Diffusion-v1.5_w8a16.onnx) |
66
- | UnetQuantizable | w8a16 | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | QNN | 71.333 ms | 0 - 14 MB | NPU | Use Export Script |
67
- | UnetQuantizable | w8a16 | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | ONNX | 72.163 ms | 0 - 14 MB | NPU | [Stable-Diffusion-v1.5.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v1.5/blob/main/Stable-Diffusion-v1.5_w8a16.onnx) |
68
- | UnetQuantizable | w8a16 | Snapdragon X Elite CRD | Snapdragon® X Elite | QNN | 116.583 ms | 0 - 0 MB | NPU | Use Export Script |
69
- | UnetQuantizable | w8a16 | Snapdragon X Elite CRD | Snapdragon® X Elite | ONNX | 117.204 ms | 842 - 842 MB | NPU | [Stable-Diffusion-v1.5.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v1.5/blob/main/Stable-Diffusion-v1.5_w8a16.onnx) |
70
- | VaeDecoderQuantizable | w8a16 | QCS8275 (Proxy) | Qualcomm® QCS8275 (Proxy) | QNN | 720.634 ms | 0 - 10 MB | NPU | Use Export Script |
71
- | VaeDecoderQuantizable | w8a16 | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | QNN | 270.075 ms | 0 - 3 MB | NPU | Use Export Script |
72
- | VaeDecoderQuantizable | w8a16 | QCS9075 (Proxy) | Qualcomm® QCS9075 (Proxy) | QNN | 250.433 ms | 0 - 12 MB | NPU | Use Export Script |
73
- | VaeDecoderQuantizable | w8a16 | SA7255P ADP | Qualcomm® SA7255P | QNN | 720.634 ms | 0 - 10 MB | NPU | Use Export Script |
74
- | VaeDecoderQuantizable | w8a16 | SA8255 (Proxy) | Qualcomm® SA8255P (Proxy) | QNN | 270.306 ms | 0 - 3 MB | NPU | Use Export Script |
75
- | VaeDecoderQuantizable | w8a16 | SA8650 (Proxy) | Qualcomm® SA8650P (Proxy) | QNN | 270.958 ms | 0 - 3 MB | NPU | Use Export Script |
76
- | VaeDecoderQuantizable | w8a16 | SA8775P ADP | Qualcomm® SA8775P | QNN | 250.433 ms | 0 - 12 MB | NPU | Use Export Script |
77
- | VaeDecoderQuantizable | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | QNN | 269.975 ms | 0 - 3 MB | NPU | Use Export Script |
78
- | VaeDecoderQuantizable | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | ONNX | 270.211 ms | 0 - 66 MB | NPU | [Stable-Diffusion-v1.5.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v1.5/blob/main/Stable-Diffusion-v1.5_w8a16.onnx) |
79
- | VaeDecoderQuantizable | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN | 206.896 ms | 0 - 21 MB | NPU | Use Export Script |
80
- | VaeDecoderQuantizable | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | ONNX | 208.726 ms | 3 - 18 MB | NPU | [Stable-Diffusion-v1.5.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v1.5/blob/main/Stable-Diffusion-v1.5_w8a16.onnx) |
81
- | VaeDecoderQuantizable | w8a16 | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | QNN | 174.459 ms | 0 - 15 MB | NPU | Use Export Script |
82
- | VaeDecoderQuantizable | w8a16 | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | ONNX | 171.949 ms | 3 - 17 MB | NPU | [Stable-Diffusion-v1.5.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v1.5/blob/main/Stable-Diffusion-v1.5_w8a16.onnx) |
83
- | VaeDecoderQuantizable | w8a16 | Snapdragon X Elite CRD | Snapdragon® X Elite | QNN | 266.518 ms | 0 - 0 MB | NPU | Use Export Script |
84
- | VaeDecoderQuantizable | w8a16 | Snapdragon X Elite CRD | Snapdragon® X Elite | ONNX | 267.586 ms | 63 - 63 MB | NPU | [Stable-Diffusion-v1.5.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v1.5/blob/main/Stable-Diffusion-v1.5_w8a16.onnx) |
85
 
86
 
87
 
@@ -91,7 +91,7 @@ More details on model performance across various devices, can be found
91
 
92
  Install the package via pip:
93
  ```bash
94
- pip install "qai-hub-models[stable-diffusion-v1-5-w8a16-quantized]"
95
  ```
96
 
97
 
@@ -115,7 +115,7 @@ The package contains a simple end-to-end demo that downloads pre-trained
115
  weights and runs this model on a sample input.
116
 
117
  ```bash
118
- python -m qai_hub_models.models.stable_diffusion_v1_5_w8a16_quantized.demo
119
  ```
120
 
121
  The above demo runs a reference implementation of pre-processing, model
@@ -124,7 +124,7 @@ inference, and post processing.
124
  **NOTE**: If you want to run this in a Jupyter Notebook or Google Colab-like
125
  environment, please add the following to your cell (instead of the above).
126
  ```
127
- %run -m qai_hub_models.models.stable_diffusion_v1_5_w8a16_quantized.demo
128
  ```
129
 
130
 
@@ -137,7 +137,7 @@ device. This script does the following:
137
  * Accuracy check between PyTorch and on-device outputs.
138
 
139
  ```bash
140
- python -m qai_hub_models.models.stable_diffusion_v1_5_w8a16_quantized.export
141
  ```
142
  ```
143
  Profiling Results
@@ -145,7 +145,7 @@ Profiling Results
145
  TextEncoderQuantizable
146
  Device : cs_8275 (ANDROID 14)
147
  Runtime : QNN
148
- Estimated inference time (ms) : 9.3
149
  Estimated peak memory usage (MB): [0, 9]
150
  Total # Ops : 533
151
  Compute Unit(s) : npu (533 ops) gpu (0 ops) cpu (0 ops)
@@ -154,7 +154,7 @@ Compute Unit(s) : npu (533 ops) gpu (0 ops) cpu (0 ops)
154
  UnetQuantizable
155
  Device : cs_8275 (ANDROID 14)
156
  Runtime : QNN
157
- Estimated inference time (ms) : 269.5
158
  Estimated peak memory usage (MB): [0, 8]
159
  Total # Ops : 4041
160
  Compute Unit(s) : npu (4041 ops) gpu (0 ops) cpu (0 ops)
@@ -164,7 +164,7 @@ VaeDecoderQuantizable
164
  Device : cs_8275 (ANDROID 14)
165
  Runtime : QNN
166
  Estimated inference time (ms) : 720.6
167
- Estimated peak memory usage (MB): [0, 10]
168
  Total # Ops : 189
169
  Compute Unit(s) : npu (189 ops) gpu (0 ops) cpu (0 ops)
170
  ```
@@ -188,7 +188,7 @@ provides instructions on how to use the `.so` shared library in an Android appl
188
 
189
 
190
  ## View on Qualcomm® AI Hub
191
- Get more details on Stable-Diffusion-v1.5's performance across various devices [here](https://aihub.qualcomm.com/models/stable_diffusion_v1_5_w8a16_quantized).
192
  Explore all available models on [Qualcomm® AI Hub](https://aihub.qualcomm.com/)
193
 
194
 
 
8
 
9
  ---
10
 
11
+ ![](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/stable_diffusion_v1_5/web-assets/model_demo.png)
12
 
13
  # Stable-Diffusion-v1.5: Optimized for Mobile Deployment
14
  ## State-of-the-art generative AI model used to generate detailed images conditioned on text descriptions
 
21
 
22
  This repository provides scripts to run Stable-Diffusion-v1.5 on Qualcomm® devices.
23
  More details on model performance across various devices can be found
24
+ [here](https://aihub.qualcomm.com/models/stable_diffusion_v1_5).
25
 
26
 
27
  ### Model Details
 
37
 
38
  | Model | Precision | Device | Chipset | Target Runtime | Inference Time (ms) | Peak Memory Range (MB) | Primary Compute Unit | Target Model |
39
  |---|---|---|---|---|---|---|---|---|
40
+ | TextEncoderQuantizable | w8a16 | QCS8275 (Proxy) | Qualcomm® QCS8275 (Proxy) | QNN | 9.359 ms | 0 - 9 MB | NPU | Use Export Script |
41
+ | TextEncoderQuantizable | w8a16 | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | QNN | 4.49 ms | 0 - 3 MB | NPU | Use Export Script |
42
+ | TextEncoderQuantizable | w8a16 | QCS9075 (Proxy) | Qualcomm® QCS9075 (Proxy) | QNN | 4.954 ms | 0 - 10 MB | NPU | Use Export Script |
43
+ | TextEncoderQuantizable | w8a16 | SA7255P ADP | Qualcomm® SA7255P | QNN | 9.359 ms | 0 - 9 MB | NPU | Use Export Script |
44
+ | TextEncoderQuantizable | w8a16 | SA8255 (Proxy) | Qualcomm® SA8255P (Proxy) | QNN | 4.541 ms | 0 - 2 MB | NPU | Use Export Script |
45
+ | TextEncoderQuantizable | w8a16 | SA8650 (Proxy) | Qualcomm® SA8650P (Proxy) | QNN | 4.619 ms | 0 - 2 MB | NPU | Use Export Script |
46
+ | TextEncoderQuantizable | w8a16 | SA8775P ADP | Qualcomm® SA8775P | QNN | 4.954 ms | 0 - 10 MB | NPU | Use Export Script |
47
+ | TextEncoderQuantizable | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | QNN | 4.56 ms | 0 - 10 MB | NPU | Use Export Script |
48
+ | TextEncoderQuantizable | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | ONNX | 4.728 ms | 0 - 164 MB | NPU | [Stable-Diffusion-v1.5.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v1.5/blob/main/Stable-Diffusion-v1.5_w8a16.onnx) |
49
+ | TextEncoderQuantizable | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN | 3.271 ms | 0 - 18 MB | NPU | Use Export Script |
50
+ | TextEncoderQuantizable | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | ONNX | 3.346 ms | 0 - 14 MB | NPU | [Stable-Diffusion-v1.5.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v1.5/blob/main/Stable-Diffusion-v1.5_w8a16.onnx) |
51
+ | TextEncoderQuantizable | w8a16 | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | QNN | 3.046 ms | 0 - 14 MB | NPU | Use Export Script |
52
+ | TextEncoderQuantizable | w8a16 | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | ONNX | 3.189 ms | 0 - 14 MB | NPU | [Stable-Diffusion-v1.5.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v1.5/blob/main/Stable-Diffusion-v1.5_w8a16.onnx) |
53
+ | TextEncoderQuantizable | w8a16 | Snapdragon X Elite CRD | Snapdragon® X Elite | QNN | 4.891 ms | 1 - 1 MB | NPU | Use Export Script |
54
+ | TextEncoderQuantizable | w8a16 | Snapdragon X Elite CRD | Snapdragon® X Elite | ONNX | 4.915 ms | 157 - 157 MB | NPU | [Stable-Diffusion-v1.5.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v1.5/blob/main/Stable-Diffusion-v1.5_w8a16.onnx) |
55
+ | UnetQuantizable | w8a16 | QCS8275 (Proxy) | Qualcomm® QCS8275 (Proxy) | QNN | 269.398 ms | 0 - 8 MB | NPU | Use Export Script |
56
+ | UnetQuantizable | w8a16 | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | QNN | 114.529 ms | 0 - 2 MB | NPU | Use Export Script |
57
+ | UnetQuantizable | w8a16 | QCS9075 (Proxy) | Qualcomm® QCS9075 (Proxy) | QNN | 108.714 ms | 0 - 8 MB | NPU | Use Export Script |
58
+ | UnetQuantizable | w8a16 | SA7255P ADP | Qualcomm® SA7255P | QNN | 269.398 ms | 0 - 8 MB | NPU | Use Export Script |
59
+ | UnetQuantizable | w8a16 | SA8255 (Proxy) | Qualcomm® SA8255P (Proxy) | QNN | 114.507 ms | 1 - 3 MB | NPU | Use Export Script |
60
+ | UnetQuantizable | w8a16 | SA8650 (Proxy) | Qualcomm® SA8650P (Proxy) | QNN | 113.487 ms | 0 - 2 MB | NPU | Use Export Script |
61
+ | UnetQuantizable | w8a16 | SA8775P ADP | Qualcomm® SA8775P | QNN | 108.714 ms | 0 - 8 MB | NPU | Use Export Script |
62
+ | UnetQuantizable | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | QNN | 114.344 ms | 0 - 2 MB | NPU | Use Export Script |
63
+ | UnetQuantizable | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | ONNX | 112.155 ms | 0 - 4 MB | NPU | [Stable-Diffusion-v1.5.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v1.5/blob/main/Stable-Diffusion-v1.5_w8a16.onnx) |
64
+ | UnetQuantizable | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN | 81.714 ms | 0 - 19 MB | NPU | Use Export Script |
65
+ | UnetQuantizable | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | ONNX | 79.459 ms | 0 - 14 MB | NPU | [Stable-Diffusion-v1.5.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v1.5/blob/main/Stable-Diffusion-v1.5_w8a16.onnx) |
66
+ | UnetQuantizable | w8a16 | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | QNN | 71.239 ms | 0 - 14 MB | NPU | Use Export Script |
67
+ | UnetQuantizable | w8a16 | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | ONNX | 71.488 ms | 0 - 15 MB | NPU | [Stable-Diffusion-v1.5.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v1.5/blob/main/Stable-Diffusion-v1.5_w8a16.onnx) |
68
+ | UnetQuantizable | w8a16 | Snapdragon X Elite CRD | Snapdragon® X Elite | QNN | 116.593 ms | 0 - 0 MB | NPU | Use Export Script |
69
+ | UnetQuantizable | w8a16 | Snapdragon X Elite CRD | Snapdragon® X Elite | ONNX | 114.443 ms | 842 - 842 MB | NPU | [Stable-Diffusion-v1.5.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v1.5/blob/main/Stable-Diffusion-v1.5_w8a16.onnx) |
70
+ | VaeDecoderQuantizable | w8a16 | QCS8275 (Proxy) | Qualcomm® QCS8275 (Proxy) | QNN | 720.65 ms | 0 - 9 MB | NPU | Use Export Script |
71
+ | VaeDecoderQuantizable | w8a16 | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | QNN | 268.706 ms | 0 - 3 MB | NPU | Use Export Script |
72
+ | VaeDecoderQuantizable | w8a16 | QCS9075 (Proxy) | Qualcomm® QCS9075 (Proxy) | QNN | 250.387 ms | 0 - 12 MB | NPU | Use Export Script |
73
+ | VaeDecoderQuantizable | w8a16 | SA7255P ADP | Qualcomm® SA7255P | QNN | 720.65 ms | 0 - 9 MB | NPU | Use Export Script |
74
+ | VaeDecoderQuantizable | w8a16 | SA8255 (Proxy) | Qualcomm® SA8255P (Proxy) | QNN | 273.815 ms | 0 - 2 MB | NPU | Use Export Script |
75
+ | VaeDecoderQuantizable | w8a16 | SA8650 (Proxy) | Qualcomm® SA8650P (Proxy) | QNN | 274.195 ms | 0 - 2 MB | NPU | Use Export Script |
76
+ | VaeDecoderQuantizable | w8a16 | SA8775P ADP | Qualcomm® SA8775P | QNN | 250.387 ms | 0 - 12 MB | NPU | Use Export Script |
77
+ | VaeDecoderQuantizable | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | QNN | 270.703 ms | 0 - 3 MB | NPU | Use Export Script |
78
+ | VaeDecoderQuantizable | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | ONNX | 268.632 ms | 0 - 66 MB | NPU | [Stable-Diffusion-v1.5.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v1.5/blob/main/Stable-Diffusion-v1.5_w8a16.onnx) |
79
+ | VaeDecoderQuantizable | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN | 205.905 ms | 0 - 21 MB | NPU | Use Export Script |
80
+ | VaeDecoderQuantizable | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | ONNX | 206.342 ms | 3 - 23 MB | NPU | [Stable-Diffusion-v1.5.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v1.5/blob/main/Stable-Diffusion-v1.5_w8a16.onnx) |
81
+ | VaeDecoderQuantizable | w8a16 | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | QNN | 192.889 ms | 0 - 15 MB | NPU | Use Export Script |
82
+ | VaeDecoderQuantizable | w8a16 | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | ONNX | 175.944 ms | 3 - 17 MB | NPU | [Stable-Diffusion-v1.5.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v1.5/blob/main/Stable-Diffusion-v1.5_w8a16.onnx) |
83
+ | VaeDecoderQuantizable | w8a16 | Snapdragon X Elite CRD | Snapdragon® X Elite | QNN | 266.828 ms | 0 - 0 MB | NPU | Use Export Script |
84
+ | VaeDecoderQuantizable | w8a16 | Snapdragon X Elite CRD | Snapdragon® X Elite | ONNX | 264.883 ms | 63 - 63 MB | NPU | [Stable-Diffusion-v1.5.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v1.5/blob/main/Stable-Diffusion-v1.5_w8a16.onnx) |
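
The benchmark rows above are regular enough to process with a few lines of Python. The sketch below is illustrative only: it copies four of the updated rows into a string (without the diff `+` prefix), and the helper name `parse_row` is hypothetical. It picks the lowest-latency device for each component.

```python
# Four rows copied from the updated table; extend with more rows as needed.
ROWS = """\
| TextEncoderQuantizable | w8a16 | QCS8275 (Proxy) | Qualcomm® QCS8275 (Proxy) | QNN | 9.359 ms | 0 - 9 MB | NPU | Use Export Script |
| TextEncoderQuantizable | w8a16 | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | QNN | 3.046 ms | 0 - 14 MB | NPU | Use Export Script |
| UnetQuantizable | w8a16 | QCS8275 (Proxy) | Qualcomm® QCS8275 (Proxy) | QNN | 269.398 ms | 0 - 8 MB | NPU | Use Export Script |
| UnetQuantizable | w8a16 | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | QNN | 71.239 ms | 0 - 14 MB | NPU | Use Export Script |
"""


def parse_row(line):
    """Split one markdown table row into (model, device, runtime, latency_ms)."""
    cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
    model, _precision, device, _chipset, runtime, latency = cells[:6]
    # Latency cells look like "9.359 ms"; keep only the numeric part.
    return model, device, runtime, float(latency.split()[0])


# Track the fastest (device, runtime, latency_ms) seen for each component.
fastest = {}
for row in ROWS.splitlines():
    model, device, runtime, latency_ms = parse_row(row)
    if model not in fastest or latency_ms < fastest[model][2]:
        fastest[model] = (device, runtime, latency_ms)

for model, (device, runtime, latency_ms) in fastest.items():
    print(f"{model}: {device} via {runtime} at {latency_ms} ms")
```
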
85
 
86
 
87
 
 
91
 
92
  Install the package via pip:
93
  ```bash
94
+ pip install "qai-hub-models[stable-diffusion-v1-5]"
95
  ```
96
 
97
 
 
115
  weights and runs this model on a sample input.
116
 
117
  ```bash
118
+ python -m qai_hub_models.models.stable_diffusion_v1_5.demo
119
  ```
120
 
121
  The above demo runs a reference implementation of pre-processing, model
 
124
  **NOTE**: If you want to run this in a Jupyter Notebook or Google Colab-like
125
  environment, please add the following to your cell (instead of the above).
126
  ```
127
+ %run -m qai_hub_models.models.stable_diffusion_v1_5.demo
128
  ```
129
 
130
 
 
137
  * Accuracy check between PyTorch and on-device outputs.
138
 
139
  ```bash
140
+ python -m qai_hub_models.models.stable_diffusion_v1_5.export
141
  ```
142
  ```
143
  Profiling Results
 
145
  TextEncoderQuantizable
146
  Device : cs_8275 (ANDROID 14)
147
  Runtime : QNN
148
+ Estimated inference time (ms) : 9.4
149
  Estimated peak memory usage (MB): [0, 9]
150
  Total # Ops : 533
151
  Compute Unit(s) : npu (533 ops) gpu (0 ops) cpu (0 ops)
 
154
  UnetQuantizable
155
  Device : cs_8275 (ANDROID 14)
156
  Runtime : QNN
157
+ Estimated inference time (ms) : 269.4
158
  Estimated peak memory usage (MB): [0, 8]
159
  Total # Ops : 4041
160
  Compute Unit(s) : npu (4041 ops) gpu (0 ops) cpu (0 ops)
 
164
  Device : cs_8275 (ANDROID 14)
165
  Runtime : QNN
166
  Estimated inference time (ms) : 720.6
167
+ Estimated peak memory usage (MB): [0, 9]
168
  Total # Ops : 189
169
  Compute Unit(s) : npu (189 ops) gpu (0 ops) cpu (0 ops)
170
  ```
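
The per-component figures can be combined into a rough end-to-end estimate. The sketch below is an illustration under stated assumptions, not a measured result: the step count (20) and the use of classifier-free guidance (two text-encoder passes, and two UNet passes per denoising step) are assumptions that this README does not state, and scheduler, tokenizer, and host overhead are ignored. The latencies are the updated QCS8275 (Proxy) QNN values from the table.

```python
def estimate_pipeline_ms(text_encoder_ms, unet_ms, vae_decoder_ms,
                         num_steps=20, guidance=True):
    """Rough latency model: encode the prompt(s), run the UNet once per
    denoising step (twice with guidance), then decode the latent once.

    num_steps and guidance are assumed defaults, not values from the README.
    """
    passes = 2 if guidance else 1
    return (passes * text_encoder_ms
            + num_steps * passes * unet_ms
            + vae_decoder_ms)


# QCS8275 (Proxy), QNN runtime, values taken from the table in this diff.
total_ms = estimate_pipeline_ms(9.359, 269.398, 720.65)
print(f"Estimated end-to-end latency: {total_ms / 1000:.1f} s")
```

Under these assumptions the UNet dominates the total, so the step count matters far more than the text-encoder or VAE-decoder latency.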
 
188
 
189
 
190
  ## View on Qualcomm® AI Hub
191
+ Get more details on Stable-Diffusion-v1.5's performance across various devices [here](https://aihub.qualcomm.com/models/stable_diffusion_v1_5).
192
  Explore all available models on [Qualcomm® AI Hub](https://aihub.qualcomm.com/)
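
This commit renames the model identifier from `stable_diffusion_v1_5_w8a16_quantized` to `stable_diffusion_v1_5` in the pip extra, the module path, and the AI Hub URL. For scripts that still use the old name, a minimal sketch of a string rewrite follows; the helper `migrate_reference` is hypothetical, and whether the old name remains installable or importable after v0.29.1 is not stated here.

```python
# Old -> new identifiers, exactly as they appear in this diff.
RENAMES = {
    "stable_diffusion_v1_5_w8a16_quantized": "stable_diffusion_v1_5",
    "stable-diffusion-v1-5-w8a16-quantized": "stable-diffusion-v1-5",
}


def migrate_reference(text):
    """Replace old model identifiers in a command or source line."""
    for old, new in RENAMES.items():
        text = text.replace(old, new)
    return text


print(migrate_reference(
    "python -m qai_hub_models.models.stable_diffusion_v1_5_w8a16_quantized.demo"
))
print(migrate_reference(
    'pip install "qai-hub-models[stable-diffusion-v1-5-w8a16-quantized]"'
))
```
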
193
 
194
 
TextEncoder.bin DELETED
@@ -1,3 +0,0 @@
1
- version https://git-lfs.github.com/spec/v1
2
- oid sha256:5ea609803056cc46b35aaf7db04e7091a2cdeee823e64bbd569faf594b7e6e8b
3
- size 163545088
 
 
 
 
TextEncoderQuantizable.bin DELETED
@@ -1,3 +0,0 @@
1
- version https://git-lfs.github.com/spec/v1
2
- oid sha256:9ed3b67ad0b0725f72b42427afef780be75fdfd138b874cb891e2af34dcbac8e
3
- size 163545088
 
 
 
 
TextEncoderQuantizable_w8a16.bin DELETED
@@ -1,3 +0,0 @@
1
- version https://git-lfs.github.com/spec/v1
2
- oid sha256:d311113834583b852501aee90ffbb25a35f128fc43fb712600d65c674f974040
3
- size 163548336
 
 
 
 
TextEncoderQuantizable_w8a16.onnx.zip DELETED
@@ -1,3 +0,0 @@
1
- version https://git-lfs.github.com/spec/v1
2
- oid sha256:3cf1a0900cd118efd4d11f5067ba79a8f99d8995b5c11947d6b5228154086857
3
- size 127241529
 
 
 
 
TextEncoder_Quantized.bin DELETED
@@ -1,3 +0,0 @@
1
- version https://git-lfs.github.com/spec/v1
2
- oid sha256:aad7cc2d5c4ae1ceb59264d47880c109bdc963aa1d0841d47dfcd34032556abe
3
- size 163275152
 
 
 
 
UNet_Quantized.bin DELETED
@@ -1,3 +0,0 @@
1
- version https://git-lfs.github.com/spec/v1
2
- oid sha256:e7523141556997cc2e6b4a1bacc0dc59b38b05fd18aae8c64004987d05f0eb7e
3
- size 878473240
 
 
 
 
Unet.bin DELETED
@@ -1,3 +0,0 @@
1
- version https://git-lfs.github.com/spec/v1
2
- oid sha256:a8057f09a165388abfdbfc1520983ff368bf58dd5abf0fd29affafbee68e3e1b
3
- size 879088632
 
 
 
 
UnetQuantizable.bin DELETED
@@ -1,3 +0,0 @@
1
- version https://git-lfs.github.com/spec/v1
2
- oid sha256:a8acc84d9be477334dc746e4d9c7ac94ec82a0aa538fb186a1452f4a377f2bec
3
- size 879088632
 
 
 
 
UnetQuantizable_w8a16.bin DELETED
@@ -1,3 +0,0 @@
1
- version https://git-lfs.github.com/spec/v1
2
- oid sha256:3b0f06fd2f9fb9d3ec1e5a0b5d46242eab182a187a70f45b6639023338ac2e1e
3
- size 881209680
 
 
 
 
VAEDecoder_Quantized.bin DELETED
@@ -1,3 +0,0 @@
1
- version https://git-lfs.github.com/spec/v1
2
- oid sha256:7789b6a8b8aa6ae02f20f2817b54b410c45f0fddee9cf231cf3aac83724f8975
3
- size 59072424
 
 
 
 
VaeDecoder.bin DELETED
@@ -1,3 +0,0 @@
1
- version https://git-lfs.github.com/spec/v1
2
- oid sha256:9a2d7e70ba95a2749d73f9785233d25ebf5abb1e34351d87c0f1c9e0adb00d49
3
- size 64693320
 
 
 
 
VaeDecoderQuantizable.bin DELETED
@@ -1,3 +0,0 @@
1
- version https://git-lfs.github.com/spec/v1
2
- oid sha256:8b613581e3a9ff71c1637918dce76ba296d55abc8807f0faef12556ed60525d3
3
- size 64693320
 
 
 
 
VaeDecoderQuantizable.so DELETED
@@ -1,3 +0,0 @@
1
- version https://git-lfs.github.com/spec/v1
2
- oid sha256:a89f37abd5657cf80a35936d75058bfb964fc36920eaa0d66bc9b2fe37822d83
3
- size 50386176
 
 
 
 
VaeDecoderQuantizable_w8a16.bin DELETED
@@ -1,3 +0,0 @@
1
- version https://git-lfs.github.com/spec/v1
2
- oid sha256:c818562619cfc7622ab57fb139afc9033df650f4049d8d2b9443210e5a7b7846
3
- size 64701512
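
Each deleted file above is a Git LFS pointer: three `key value` lines giving the spec version, the object hash, and the size in bytes. As a minimal sketch, the pointers can be parsed and their sizes summed; the helper name `parse_lfs_pointer` is hypothetical, and the two sample pointers are copied from `TextEncoder.bin` and `VaeDecoderQuantizable_w8a16.bin` in this diff.

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into its version, hash, and size."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid field has the form "<algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


POINTERS = [
    """version https://git-lfs.github.com/spec/v1
oid sha256:5ea609803056cc46b35aaf7db04e7091a2cdeee823e64bbd569faf594b7e6e8b
size 163545088
""",
    """version https://git-lfs.github.com/spec/v1
oid sha256:c818562619cfc7622ab57fb139afc9033df650f4049d8d2b9443210e5a7b7846
size 64701512
""",
]

parsed = [parse_lfs_pointer(p) for p in POINTERS]
total_bytes = sum(p["size"] for p in parsed)
print(f"{len(parsed)} pointers, {total_bytes / 1e6:.1f} MB referenced")
```
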