arXiv:2406.05835

Mamba YOLO: SSMs-Based YOLO For Object Detection

Published on Jun 9, 2024

Abstract

Propelled by the rapid advancement of deep learning technologies, the YOLO series has set a new benchmark for real-time object detectors. Researchers have continuously explored innovative applications of reparameterization, efficient layer aggregation networks, and anchor-free techniques on the foundation of YOLO. To further enhance detection performance, Transformer-based structures have been introduced, significantly expanding the model's receptive field and achieving notable performance gains. However, such improvements come at a cost, as the quadratic complexity of the self-attention mechanism increases the computational burden of the model. Fortunately, the emergence of State Space Models (SSMs) as an innovative technology has effectively mitigated the issues caused by quadratic complexity. In light of these advancements, we introduce Mamba-YOLO, a novel object detection model based on SSMs. Mamba-YOLO not only optimizes the SSM foundation but also adapts it specifically for object detection tasks. Given the potential limitations of SSMs in sequence modeling, such as an insufficient receptive field and weak image locality, we have designed the LSBlock and RGBlock. These modules enable more precise capture of local image dependencies and significantly enhance the robustness of the model. Extensive experimental results on the publicly available benchmark datasets COCO and VOC demonstrate that Mamba-YOLO surpasses existing YOLO series models in performance, showcasing its substantial potential and competitive edge. The PyTorch code is available at: https://github.com/HZAI-ZJNU/Mamba-YOLO
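For intuition on the complexity argument in the abstract, below is a minimal state space scan in PyTorch. This is an illustrative sketch, not the authors' Mamba-YOLO code (which is at the GitHub link above): the diagonal transition matrix `A`, the projection shapes, and the `ssm_scan` helper are assumptions chosen for simplicity. It only demonstrates why a recurrent SSM costs O(L) in sequence length, whereas self-attention costs O(L^2).

```python
# Minimal (non-selective) state space scan: h_t = A*h_{t-1} + B*x_t, y_t = C*h_t.
# Illustrative only; NOT the Mamba-YOLO implementation.
import torch

def ssm_scan(x, A, B, C):
    """Sequential SSM recurrence over a length-L sequence.

    x: (L, d)  input sequence (e.g. flattened image patch tokens)
    A: (n,)    diagonal state transition (assumed diagonal for simplicity)
    B: (n, d)  input projection
    C: (d, n)  output projection
    Each step costs O(n*d), so the whole scan is O(L) in sequence length,
    versus O(L^2) for pairwise self-attention.
    """
    L, d = x.shape
    n = A.shape[0]
    h = torch.zeros(n)
    ys = []
    for t in range(L):
        h = A * h + B @ x[t]   # state update
        ys.append(C @ h)       # readout
    return torch.stack(ys)     # (L, d)

# Toy usage: 196 patch tokens of dim 32, state size 16.
x = torch.randn(196, 32)
A = torch.rand(16) * 0.9       # stable decay factors in (0, 0.9)
B = torch.randn(16, 32) * 0.1
C = torch.randn(32, 16) * 0.1
y = ssm_scan(x, A, B, C)
print(y.shape)                 # torch.Size([196, 32])
```

Mamba-style SSMs replace this Python loop with a hardware-aware parallel scan and make the A, B, C parameters input-dependent (selective), but the linear-in-L cost shown here is the property the abstract appeals to.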
