YOLOv12‑x Object Detector

YOLOv12‑x, an attention‑centric, real‑time object detection model that runs through the Ultralytics package, is now available on Hugging Face.


🧠 Model Description

YOLOv12‑x builds on the YOLO12 family by combining Area Attention and R‑ELAN modules to deliver state‑of‑the‑art detection accuracy with fewer parameters and FLOPs. Optional FlashAttention integration further reduces memory access overhead and boosts inference speed on modern NVIDIA GPUs.


⚙️ Requirements

  • Python ≥ 3.8

  • PyTorch ≥ 1.10 (CUDA‑enabled)

  • CUDA ≥ 11.2 compatible GPU

  • Optional: FlashAttention (install via pip install flash-attn)

  • Recommended GPU architectures for FlashAttention support:

    • Turing (e.g. T4, Quadro RTX)
    • Ampere (RTX 30 series, A30/40/100)
    • Ada Lovelace (RTX 40 series)
    • Hopper (H100/H200)
  • System specs: ≥ 8 GB RAM, ≥ 50 GB free disk
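
The requirements above can be checked programmatically before installing anything heavy. The following is a minimal sketch, not part of the original model card: the helper names (`version_tuple`, `flash_attn_arch`, `check_environment`) are hypothetical, and the compute-capability table is an assumption based on NVIDIA's published capability numbers for the GPU families listed above.

```python
import sys

# Assumed mapping: CUDA compute capability -> architecture name for the
# GPU families listed above (T4 = 7.5, A100/A30 = 8.0, A40/RTX 30 = 8.6,
# RTX 40 = 8.9, H100/H200 = 9.0).
FLASH_ATTN_ARCHS = {
    (7, 5): "Turing",
    (8, 0): "Ampere",
    (8, 6): "Ampere",
    (8, 9): "Ada Lovelace",
    (9, 0): "Hopper",
}


def version_tuple(version: str) -> tuple:
    """Parse a version string such as '2.1.0+cu118' into (major, minor)."""
    core = version.split("+", 1)[0]
    major, minor = core.split(".")[:2]
    return int(major), int(minor)


def flash_attn_arch(capability):
    """Return the architecture name for a capability pair, or None if unlisted."""
    return FLASH_ATTN_ARCHS.get(tuple(capability))


def check_environment() -> dict:
    """Collect a small report on Python, PyTorch and GPU suitability."""
    report = {"python_ok": sys.version_info >= (3, 8)}
    try:
        import torch  # imported lazily so the report works without PyTorch
    except ImportError:
        report["torch_ok"] = False
        return report
    report["torch_ok"] = version_tuple(torch.__version__) >= (1, 10)
    report["cuda_available"] = torch.cuda.is_available()
    if report["cuda_available"]:
        capability = torch.cuda.get_device_capability(0)
        report["flash_attn_arch"] = flash_attn_arch(capability)
    return report


if __name__ == "__main__":
    print(check_environment())
```

A `flash_attn_arch` value of `None` only means the GPU is not in the list above; the model should still run without FlashAttention.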


🚀 Installation & Usage

pip install ultralytics
# (Optional for FlashAttention)
pip install flash-attn

Python example:

from ultralytics import YOLO

# Load a COCO-pretrained YOLO12x model
model = YOLO("yolo12x.pt")

# Train the model on the COCO8 example dataset for 100 epochs
results = model.train(data="coco8.yaml", epochs=100, imgsz=640)

# Run inference with the YOLO12x model on the 'bus.jpg' image
results = model("path/to/bus.jpg")
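
The inference call above returns a list of result objects, one per image. As a hedged sketch of how they might be turned into plain Python data, the hypothetical helpers `summarize` and `detect` below read the `boxes.xyxy`, `boxes.conf`, `boxes.cls` and `names` attributes of an Ultralytics result; the 0.25 confidence cut-off is an illustrative default.

```python
def summarize(boxes_xyxy, confidences, class_ids, names, min_conf=0.25):
    """Turn raw detection lists into readable dicts, dropping weak detections."""
    detections = []
    for box, conf, cls_id in zip(boxes_xyxy, confidences, class_ids):
        if conf < min_conf:
            continue
        detections.append(
            {"label": names[int(cls_id)], "confidence": conf, "box": list(box)}
        )
    return detections


def detect(image_path, weights="yolo12x.pt"):
    """Run the detector on one image and return summarized detections."""
    from ultralytics import YOLO  # imported lazily; summarize() needs no GPU stack

    result = YOLO(weights)(image_path)[0]  # first (and only) image
    return summarize(
        result.boxes.xyxy.tolist(),  # [x1, y1, x2, y2] in pixels
        result.boxes.conf.tolist(),  # confidence scores
        result.boxes.cls.tolist(),   # class indices (floats)
        result.names,                # {class index: class name}
    )
```

Calling `detect("path/to/bus.jpg")` would then yield a list of dictionaries that can be serialized or filtered without touching tensors.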

CLI example:

yolo detect predict model=yolov12x.pt source=test.jpg imgsz=640 conf=0.25
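
The `key=value` arguments of the CLI call correspond to keyword arguments of `model.predict` in Python. The sketch below makes that correspondence explicit; `cli_to_kwargs` and `run` are hypothetical helpers written for illustration, not part of the Ultralytics package.

```python
def cli_to_kwargs(command: str) -> dict:
    """Parse the key=value pairs of a `yolo` command into keyword arguments."""
    kwargs = {}
    for token in command.split():
        if "=" not in token:
            continue  # skip positional words such as 'yolo', 'detect', 'predict'
        key, value = token.split("=", 1)
        for cast in (int, float):
            try:
                value = cast(value)
                break
            except ValueError:
                pass  # not numeric in this form; try the next cast or keep the string
        kwargs[key] = value
    return kwargs


def run(command: str):
    """Execute the Python equivalent of a `yolo detect predict ...` command."""
    from ultralytics import YOLO  # imported lazily; parsing works without it

    kwargs = cli_to_kwargs(command)
    model = YOLO(kwargs.pop("model"))
    return model.predict(**kwargs)


if __name__ == "__main__":
    cmd = "yolo detect predict model=yolov12x.pt source=test.jpg imgsz=640 conf=0.25"
    print(cli_to_kwargs(cmd))
```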

📊 Performance & Use Cases

Benchmarked on COCO val2017 at 640 × 640 resolution on an NVIDIA T4 GPU:

| Model | mAP@0.5:0.95 | Latency (ms) | Params (M) | FLOPs (B) |
|---|---|---|---|---|
| YOLO12‑x | 55.2 % | 11.79 | 59.1 | 199.0 |
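
As a rough reading of the latency column, per-image latency converts to throughput as 1000 / latency in milliseconds. The sketch below applies this to the 11.79 ms figure. It assumes batch size 1 and ignores pre- and post-processing time, neither of which the benchmark line specifies, so real pipelines will likely see a lower rate.

```python
def latency_to_fps(latency_ms: float) -> float:
    """Convert per-image latency in milliseconds to frames per second."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms


fps = latency_to_fps(11.79)
print(f"{fps:.1f} FPS")  # 84.8 FPS
```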

YOLOv12‑x excels in scenarios demanding both high accuracy and near‑real‑time throughput:

  • Autonomous vehicles
  • Industrial inspection
  • Surveillance & security systems

📚 References

@article{tian2025yolov12,
  title={YOLOv12: Attention-Centric Real-Time Object Detectors},
  author={Tian, Yunjie and Ye, Qixiang and Doermann, David},
  journal={arXiv preprint arXiv:2502.12524},
  year={2025}
}

📝 Summary

| Feature | Details |
|---|---|
| Model | YOLOv12‑x |
| Architecture | Area Attention + R‑ELAN |
| FlashAttention | Optional (GPU‑accelerated) |
| Requirements | Python ≥ 3.8, PyTorch ≥ 1.10, CUDA ≥ 11.2 |
| Use Cases | Real‑time object detection with high accuracy |

Files:
├── yolov12x.pt          # Trained model weights
├── README.md            # This file
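
The weights file listed above could be fetched from the Hub and loaded as follows. This is a sketch under assumptions: the repository id `momererkoc/yolov12x` is taken from this page's model listing rather than from the usage instructions, `fetch_weights` and `load_detector` are hypothetical helper names, and the `huggingface_hub` package must be installed separately.

```python
from pathlib import Path

REPO_ID = "momererkoc/yolov12x"  # assumption: repository id as shown on the model page
FILENAME = "yolov12x.pt"         # weights file from the listing above


def fetch_weights(repo_id: str = REPO_ID, filename: str = FILENAME) -> Path:
    """Download the weights file (or reuse the local cache) and return its path."""
    from huggingface_hub import hf_hub_download  # imported lazily

    return Path(hf_hub_download(repo_id=repo_id, filename=filename))


def load_detector():
    """Load the downloaded weights into an Ultralytics YOLO model."""
    from ultralytics import YOLO  # imported lazily

    return YOLO(str(fetch_weights()))
```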