# YOLOv5s Compressed with AIminify
## Overview
This repository provides a compressed version of YOLOv5s using AIminify. The original YOLOv5s model was pretrained on the COCO dataset by Ultralytics. Leveraging AIminify's pruning and fine-tuning strategies, we reduced the model size and FLOPs while maintaining strong performance on COCO benchmarks.
## Key features
- Pruned to remove unneeded parameters and reduce computational overhead.
- Fine-tuned on the COCO dataset post-compression to restore or retain high accuracy.
- Minimal performance loss across various compression strengths, preserving mAP in most scenarios.
## Model architecture
The base architecture is YOLOv5s from Ultralytics. Modifications include:
- Pruning of selected channels/kernels based on AIminify's pruning algorithm.
- Automatic fine-tuning after pruning to recover performance.
Despite its reduced size, this model maintains similar detection capabilities for common objects as the original YOLOv5s.
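AIminify's exact pruning criterion is not public, so the sketch below illustrates a generic form of structured channel pruning often used for this purpose: rank a convolution's output filters by L1 weight norm and keep only the strongest ones. The `prune_conv_channels` helper is hypothetical and not part of AIminify or YOLOv5.

```python
import torch
import torch.nn as nn

def prune_conv_channels(conv: nn.Conv2d, keep_ratio: float) -> nn.Conv2d:
    """Keep the output channels whose filters have the largest L1 norms.

    Illustrative only: a real pipeline must also shrink the *input*
    channels of downstream layers and then fine-tune the network.
    """
    n_keep = max(1, int(conv.out_channels * keep_ratio))
    # L1 norm of each output filter (sum over in_channels, kH, kW)
    norms = conv.weight.detach().abs().sum(dim=(1, 2, 3))
    keep = torch.argsort(norms, descending=True)[:n_keep]
    pruned = nn.Conv2d(conv.in_channels, n_keep, conv.kernel_size,
                       stride=conv.stride, padding=conv.padding,
                       bias=conv.bias is not None)
    pruned.weight.data = conv.weight.data[keep].clone()
    if conv.bias is not None:
        pruned.bias.data = conv.bias.data[keep].clone()
    return pruned

conv = nn.Conv2d(3, 32, kernel_size=3, padding=1)
slim = prune_conv_channels(conv, keep_ratio=0.75)  # 32 -> 24 channels
out = slim(torch.randn(1, 3, 640, 640))
print(tuple(out.shape))
```

Pruning whole channels (rather than individual weights) is what actually reduces FLOPs and model size on standard hardware, since the resulting layers are simply smaller dense convolutions.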
## Intended use
- Primary use case: General object detection (people, vehicles, animals, etc.) in images and videos.
- Industries: Could be applied in retail (store analytics), security, robotics, autonomous vehicles, or any scenario where a fast, lightweight detector is beneficial.
- Resource-constrained environments: Ideal for devices or deployments where GPU/CPU resources are limited or when high throughput is required.
## Limitations
- Dataset bias: Trained on COCO, which may not generalize to highly domain-specific use cases (e.g., medical imaging, satellite imagery). Additional domain-specific fine-tuning might be necessary.
- Performance variations: Depending on the chosen compression strength, there might be a slight reduction in accuracy relative to the uncompressed YOLOv5s model.
## Metrics and performance
Below is a summary of performance across compression strengths, reporting COCO mAP50-95, GFLOPs, parameter count, and model size:
| Compression strength | GFLOPs | Parameters | Model size (MB) | mAP50-95 |
|---|---|---|---|---|
| 0 (baseline) | 24.0 | ~9.1M | 36.8 | 0.412 |
| 1 | 22.3 | ~8.6M | 34.8 | 0.406 |
| 2 | 21.1 | ~8.3M | 33.3 | 0.400 |
| 3 | 19.9 | ~7.9M | 31.7 | 0.394 |
| 4 | 18.6 | ~7.5M | 30.1 | 0.377 |
| 5 | 17.3 | ~7.0M | 28.4 | 0.365 |
Even at the highest compression strength (5), the model significantly outperforms a smaller YOLOv5n baseline while being more resource-efficient than the original YOLOv5s.
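To put the table in perspective, the snippet below computes the relative savings at compression strength 5 versus the baseline, using the figures above (parameter counts are approximate, per the `~` in the table):

```python
# Relative savings at compression strength 5 vs. the baseline (strength 0),
# taken from the table above.
baseline = {"gflops": 24.0, "params_m": 9.1, "size_mb": 36.8, "map": 0.412}
strength5 = {"gflops": 17.3, "params_m": 7.0, "size_mb": 28.4, "map": 0.365}

for key in ("gflops", "params_m", "size_mb"):
    saving = (baseline[key] - strength5[key]) / baseline[key] * 100
    print(f"{key}: {saving:.1f}% reduction")
print(f"mAP50-95 drop: {baseline['map'] - strength5['map']:.3f}")
```

This works out to roughly a 28% FLOPs reduction and a 23% cut in parameters and on-disk size, at the cost of 0.047 mAP50-95.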
## How to use
```python
import torch

if __name__ == '__main__':
    model_file = 'YOLOv5_compression_strength_5_unquantized.pt'

    # The checkpoint stores the full model object, so load it directly
    model = torch.load(model_file, map_location='cpu')
    model.eval()

    # Dummy batch: one RGB image at YOLOv5's default 640x640 input size
    inputs = torch.randn(1, 3, 640, 640)
    with torch.no_grad():
        results = model(inputs)
```