SqueezeNet v1.1
Use case: Image classification
Model description
SqueezeNet is a convolutional neural network that uses design strategies to reduce the number of parameters, notably fire modules that "squeeze" parameters using 1x1 convolutions. SqueezeNet 1.1 requires 2.4x less computation and has slightly fewer parameters than SqueezeNet 1.0, without sacrificing accuracy.
The model is quantized to int8 using the TensorFlow Lite converter.
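Below is a minimal sketch of how such an int8 post-training quantization can be done with the TensorFlow Lite converter, assuming a trained Keras model and a small set of representative images; the names quantize_to_int8 and rep_images are illustrative, and the exact settings used to produce the published models may differ.

```python
import numpy as np
import tensorflow as tf

def quantize_to_int8(keras_model, rep_images):
    """Post-training int8 quantization with the TensorFlow Lite converter.

    rep_images is assumed to be a NumPy array of preprocessed images
    with shape (N, H, W, 3), used as the representative (calibration) dataset.
    """
    def representative_data_gen():
        for img in rep_images[:100]:
            # The converter expects a list of input tensors per sample.
            yield [img[np.newaxis, ...].astype(np.float32)]

    converter = tf.lite.TFLiteConverter.from_keras_model(keras_model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.representative_dataset = representative_data_gen
    # Restrict ops to int8 kernels and keep uint8 input / float32 output,
    # matching the network inputs / outputs described below.
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    converter.inference_input_type = tf.uint8
    converter.inference_output_type = tf.float32
    return converter.convert()
```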
Network information
Network Information | Value |
---|---|
Framework | TensorFlow Lite |
Params | 725,061 |
Quantization | int8 |
Provenance | https://github.com/forresti/SqueezeNet |
Paper | https://arxiv.org/pdf/1602.07360.pdf |
Network inputs / outputs
For an image resolution of NxM and P classes (a minimal inference sketch follows the tables below):
Input Shape | Description |
---|---|
(1, N, M, 3) | Single NxM RGB image with UINT8 values between 0 and 255 |
Output Shape | Description |
---|---|
(1, P) | Per-class confidence for P classes in FLOAT32 |
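A minimal inference sketch with the TensorFlow Lite interpreter is shown below; the model file name is hypothetical, and a random image stands in for a real preprocessed input.

```python
import numpy as np
import tensorflow as tf

# Hypothetical file name for the 128x128 int8 model; adjust to the .tflite file you use.
interpreter = tf.lite.Interpreter(model_path="squeezenet_v1.1_128_int8.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()[0]
output_details = interpreter.get_output_details()[0]
print(input_details["shape"], input_details["dtype"])    # e.g. [1 128 128 3] uint8
print(output_details["shape"], output_details["dtype"])  # e.g. [1 P] float32

# Single NxM RGB image with UINT8 values between 0 and 255.
image = np.random.randint(0, 256, size=input_details["shape"], dtype=np.uint8)
interpreter.set_tensor(input_details["index"], image)
interpreter.invoke()

scores = interpreter.get_tensor(output_details["index"])[0]  # per-class confidences
print("Predicted class:", int(np.argmax(scores)))
```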
Recommended Platforms
Platform | Supported | Optimized |
---|---|---|
STM32L0 | [] | [] |
STM32L4 | [x] | [] |
STM32U5 | [x] | [] |
STM32H7 | [x] | [x] |
STM32MP1 | [x] | [] |
STM32MP2 | [x] | [] |
STM32N6 | [x] | [] |
Performance
Metrics
- Measurements are made with the default STM32Cube.AI configuration, with the input / output allocated option enabled.
- "tfs" stands for "training from scratch", meaning that the model weights were randomly initialized before training.
Reference NPU memory footprint on the Food-101 dataset (see Accuracy for dataset details)
Model | Format | Resolution | Series | Internal RAM (KiB) | External RAM (KiB) | Weights Flash (KiB) | STM32Cube.AI version | STEdgeAI Core version |
---|---|---|---|---|---|---|---|---|
SqueezeNet v1.1 tfs | Int8 | 128x128x3 | STM32N6 | 270.28 | 0.0 | 772.16 | 10.0.0 | 2.0.0 |
SqueezeNet v1.1 tfs | Int8 | 224x224x3 | STM32N6 | 858.23 | 0.0 | 772.16 | 10.0.0 | 2.0.0 |
Reference NPU inference time on the Food-101 dataset (see Accuracy for dataset details)
Model | Format | Resolution | Board | Execution Engine | Inference time (ms) | Inf / sec | STM32Cube.AI version | STEdgeAI Core version |
---|---|---|---|---|---|---|---|---|
SqueezeNet v1.1 tfs | Int8 | 128x128x3 | STM32N6570-DK | NPU/MCU | 3.74 | 267.38 | 10.0.0 | 2.0.0 |
SqueezeNet v1.1 tfs | Int8 | 224x224x3 | STM32N6570-DK | NPU/MCU | 7.75 | 129.03 | 10.0.0 | 2.0.0 |
Reference MCU memory footprint based on the Flowers dataset (see Accuracy for dataset details)
Model | Format | Resolution | Series | Activation RAM | Runtime RAM | Weights Flash | Code Flash | Total RAM | Total Flash | STM32Cube.AI version |
---|---|---|---|---|---|---|---|---|---|---|
SqueezeNet v1.1 tfs | Int8 | 128x128x3 | STM32H7 | 271.84 KiB | 16.47 KiB | 716.71 KiB | 78.24 KiB | 288.31 KiB | 789.55 KiB | 10.0.0 |
SqueezeNet v1.1 tfs | Int8 | 224x224x3 | STM32H7 | 816.86 KiB | 16.51 KiB | 716.71 KiB | 71.42 KiB | 833.37 KiB | 788.13 KiB | 10.0.0 |
Reference MCU inference time based on the Flowers dataset (see Accuracy for dataset details)
Model | Format | Resolution | Board | Execution Engine | Frequency | Inference time (ms) | STM32Cube.AI version |
---|---|---|---|---|---|---|---|
SqueezeNet v1.1 tfs | Int8 | 128x128x3 | STM32H747I-DISCO | 1 CPU | 400 MHz | 216.67 | 10.0.0 |
SqueezeNet v1.1 tfs | Int8 | 224x224x3 | STM32H747I-DISCO | 1 CPU | 400 MHz | 693.3 | 10.0.0 |
Reference MPU inference time based on the Flowers dataset (see Accuracy for dataset details)
Model | Format | Resolution | Quantization | Board | Execution Engine | Frequency | Inference time (ms) | %NPU | %GPU | %CPU | X-LINUX-AI version | Framework |
---|---|---|---|---|---|---|---|---|---|---|---|---|
SqueezeNet v1.1 tfs | Int8 | 128x128x3 | per-channel** | STM32MP257F-DK2 | NPU/GPU | 800 MHz | 9.72 | 8.45 | 91.55 | 0 | v5.1.0 | OpenVX |
SqueezeNet v1.1 tfs | Int8 | 224x224x3 | per-channel** | STM32MP257F-DK2 | NPU/GPU | 800 MHz | 31.11 | 8.23 | 91.77 | 0 | v5.1.0 | OpenVX |
SqueezeNet v1.1 tfs | Int8 | 128x128x3 | per-channel | STM32MP157F-DK2 | 2 CPU | 800 MHz | 44.92 | NA | NA | 100 | v5.1.0 | TensorFlowLite 2.11.0 |
SqueezeNet v1.1 tfs | Int8 | 224x224x3 | per-channel | STM32MP157F-DK2 | 2 CPU | 800 MHz | 147.80 | NA | NA | 100 | v5.1.0 | TensorFlowLite 2.11.0 |
SqueezeNet v1.1 tfs | Int8 | 128x128x3 | per-channel | STM32MP135F-DK2 | 1 CPU | 1000 MHz | 70.16 | NA | NA | 100 | v5.1.0 | TensorFlowLite 2.11.0 |
SqueezeNet v1.1 tfs | Int8 | 224x224x3 | per-channel | STM32MP135F-DK2 | 1 CPU | 1000 MHz | 234.80 | NA | NA | 100 | v5.1.0 | TensorFlowLite 2.11.0 |
** To get the most out of the MP25 NPU hardware acceleration, please use per-tensor quantization.
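As a sketch only: the TensorFlow Lite converter quantizes weights per-channel by default, and some TensorFlow releases expose an experimental, version-dependent flag to force per-tensor weights instead. The flag used below is an assumption and may change or disappear between releases; keras_model and representative_data_gen are the same placeholders as in the quantization sketch above.

```python
import tensorflow as tf

# keras_model and representative_data_gen: same assumptions as in the earlier sketch.
converter = tf.lite.TFLiteConverter.from_keras_model(keras_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data_gen
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.float32
# Experimental, version-dependent: request per-tensor instead of per-channel
# weight quantization so that more layers can run on the MP25 NPU.
converter._experimental_disable_per_channel = True
tflite_per_tensor = converter.convert()
```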
Accuracy with Flowers dataset
Dataset details: link, License: CC BY 2.0, Quotation: [1], Number of classes: 5, Number of images: 3,670
Model | Format | Resolution | Top 1 Accuracy |
---|---|---|---|
SqueezeNet v1.1 tfs | Float | 224x224x3 | 85.29 % |
SqueezeNet v1.1 tfs | Int8 | 224x224x3 | 83.24 % |
SqueezeNet v1.1 tfs | Float | 128x128x3 | 80.93 % |
SqueezeNet v1.1 tfs | Int8 | 128x128x3 | 80.93 % |
Accuracy with Food-101 dataset
Dataset details: link, Quotation: [3], Number of classes: 101, Number of images: 101,000
Model | Format | Resolution | Top 1 Accuracy |
---|---|---|---|
SqueezeNet v1.1 tfs | Float | 224x224x3 | 67.15 % |
SqueezeNet v1.1 tfs | Int8 | 224x224x3 | 66.71 % |
SqueezeNet v1.1 tfs | Float | 128x128x3 | 58.55 % |
SqueezeNet v1.1 tfs | Int8 | 128x128x3 | 58.51 % |
Accuracy with Plant-village dataset
Dataset details: link, License: CC0 1.0, Quotation: [2], Number of classes: 39, Number of images: 61,486
Model | Format | Resolution | Top 1 Accuracy |
---|---|---|---|
SqueezeNet v1.1 tfs | Float | 224x224x3 | 99.88 % |
SqueezeNet v1.1 tfs | Int8 | 224x224x3 | 99.74 % |
SqueezeNet v1.1 tfs | Float | 128x128x3 | 99.77 % |
SqueezeNet v1.1 tfs | Int8 | 128x128x3 | 99.69 % |
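The top-1 figures above are plain argmax accuracies over the evaluation images. Below is a minimal sketch of how such a number can be reproduced for an int8 model, assuming a directory-per-class image folder; the exact preprocessing and split used by the model zoo scripts may differ, so the resulting value is only indicative.

```python
import numpy as np
import tensorflow as tf

def top1_accuracy(tflite_path, data_dir, resolution=128):
    """Top-1 accuracy of a uint8-input TFLite classifier on a folder dataset."""
    ds = tf.keras.utils.image_dataset_from_directory(
        data_dir, image_size=(resolution, resolution), batch_size=1, shuffle=False)

    interpreter = tf.lite.Interpreter(model_path=tflite_path)
    interpreter.allocate_tensors()
    inp = interpreter.get_input_details()[0]
    out = interpreter.get_output_details()[0]

    correct, total = 0, 0
    for images, labels in ds:
        # Images are loaded as float32 in [0, 255]; cast to the uint8 input type.
        x = tf.cast(images, tf.uint8).numpy()
        interpreter.set_tensor(inp["index"], x)
        interpreter.invoke()
        pred = int(np.argmax(interpreter.get_tensor(out["index"])[0]))
        correct += int(pred == int(labels[0]))
        total += 1
    return correct / total
```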
Retraining and integration in a simple example
Please refer to the stm32ai-modelzoo-services GitHub here.
References
[1] "Tf_flowers : tensorflow datasets," TensorFlow. [Online]. Available: https://www.tensorflow.org/datasets/catalog/tf_flowers.
[2] J, ARUN PANDIAN; GOPAL, GEETHARAMANI (2019), "Data for: Identification of Plant Leaf Diseases Using a 9-layer Deep Convolutional Neural Network", Mendeley Data, V1, doi: 10.17632/tywbtsjrjv.1
[3] L. Bossard, M. Guillaumin, and L. Van Gool, "Food-101 -- Mining Discriminative Components with Random Forests." European Conference on Computer Vision, 2014.