# YOLOv5s Compressed with AIminify
## Overview
This repository provides a compressed version of YOLOv5s using AIminify. The original YOLOv5s model was pretrained on the COCO dataset by Ultralytics. Leveraging AIminify's pruning and fine-tuning strategies, we reduced the model size and FLOPs while maintaining strong performance on COCO benchmarks.
## Key features
- Pruned to remove unneeded parameters and reduce computational overhead.
- Fine-tuned on the COCO dataset post-compression to restore or retain high accuracy.
- Minimal performance loss across various compression strengths, preserving mAP in most scenarios.
## Model architecture
The base architecture is YOLOv5s from Ultralytics. Modifications include:
- Pruning of selected channels/kernels based on AIminify's pruning algorithm.
- Automatic fine-tuning after pruning to recover performance.
Despite its reduced size, this model maintains similar detection capabilities for common objects as the original YOLOv5s.
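AIminify's exact pruning criterion is not public, so the sketch below illustrates a generic form of structured channel pruning often used for this purpose: rank a convolution's output filters by L1 weight norm and keep only the strongest ones. The `prune_conv_channels` helper is hypothetical and not part of AIminify or YOLOv5.

```python
import torch
import torch.nn as nn

def prune_conv_channels(conv: nn.Conv2d, keep_ratio: float) -> nn.Conv2d:
    """Keep the output channels whose filters have the largest L1 norms.

    Illustrative only: a real pipeline must also shrink the *input*
    channels of downstream layers and then fine-tune the network.
    """
    n_keep = max(1, int(conv.out_channels * keep_ratio))
    # L1 norm of each output filter (sum over in_channels, kH, kW)
    norms = conv.weight.detach().abs().sum(dim=(1, 2, 3))
    keep = torch.argsort(norms, descending=True)[:n_keep]
    pruned = nn.Conv2d(conv.in_channels, n_keep, conv.kernel_size,
                       stride=conv.stride, padding=conv.padding,
                       bias=conv.bias is not None)
    pruned.weight.data = conv.weight.data[keep].clone()
    if conv.bias is not None:
        pruned.bias.data = conv.bias.data[keep].clone()
    return pruned

conv = nn.Conv2d(3, 32, kernel_size=3, padding=1)
slim = prune_conv_channels(conv, keep_ratio=0.75)  # 32 -> 24 channels
out = slim(torch.randn(1, 3, 640, 640))
print(tuple(out.shape))
```

Pruning whole channels (rather than individual weights) is what actually reduces FLOPs and model size on standard hardware, since the resulting layers are simply smaller dense convolutions.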
## Intended use
- Primary use case: General object detection (people, vehicles, animals, etc.) in images and videos.
- Industries: Could be applied in retail (store analytics), security, robotics, autonomous vehicles, or any scenario where a fast, lightweight detector is beneficial.
- Resource-constrained environments: Ideal for devices or deployments where GPU/CPU resources are limited or when high throughput is required.
## Limitations
- Dataset bias: Trained on COCO, which may not generalize to highly domain-specific use cases (e.g., medical imaging, satellite imagery). Additional domain-specific fine-tuning might be necessary.
- Performance variations: Depending on the chosen compression strength, there might be a slight reduction in accuracy relative to the uncompressed YOLOv5s model.
## Metrics and performance
Below is a summary of performance across compression strengths, reporting COCO mAP50-95, GFLOPs, parameter count, and model size:
| Compression strength | GFLOPs | Parameters | Model size (MB) | mAP50-95 |
|---|---|---|---|---|
| 0 (baseline) | 24.0 | ~9.1M | 36.8 | 0.412 |
| 1 | 22.3 | ~8.6M | 34.8 | 0.406 |
| 2 | 21.1 | ~8.3M | 33.3 | 0.400 |
| 3 | 19.9 | ~7.9M | 31.7 | 0.394 |
| 4 | 18.6 | ~7.5M | 30.1 | 0.377 |
| 5 | 17.3 | ~7.0M | 28.4 | 0.365 |
Even at the highest compression strength (5), the model significantly outperforms a smaller YOLOv5n baseline while being more resource-efficient than the original YOLOv5s.
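To put the table in perspective, the snippet below computes the relative savings at compression strength 5 versus the baseline, using the figures above (parameter counts are approximate, per the `~` in the table):

```python
# Relative savings at compression strength 5 vs. the baseline (strength 0),
# taken from the table above.
baseline = {"gflops": 24.0, "params_m": 9.1, "size_mb": 36.8, "map": 0.412}
strength5 = {"gflops": 17.3, "params_m": 7.0, "size_mb": 28.4, "map": 0.365}

for key in ("gflops", "params_m", "size_mb"):
    saving = (baseline[key] - strength5[key]) / baseline[key] * 100
    print(f"{key}: {saving:.1f}% reduction")
print(f"mAP50-95 drop: {baseline['map'] - strength5['map']:.3f}")
```

This works out to roughly a 28% FLOPs reduction and a 23% cut in parameters and on-disk size, at the cost of 0.047 mAP50-95.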
## How to use
```python
import torch

if __name__ == '__main__':
    model_file = 'YOLOv5_compression_strength_5_unquantized.pt'

    # The checkpoint stores the full model object, so load it directly
    model = torch.load(model_file, map_location='cpu')
    model.eval()

    # Dummy batch: one RGB image at YOLOv5's default 640x640 input size
    inputs = torch.randn(1, 3, 640, 640)
    with torch.no_grad():
        results = model(inputs)
```