YOLOv12‑x Object Detector

YOLOv12‑x, an attention‑centric, real‑time object detection model supported by the Ultralytics package, is now available on Hugging Face.

🧠 Model Description

YOLOv12‑x belongs to the YOLO12 family, which combines Area Attention and R‑ELAN modules to deliver state‑of‑the‑art detection accuracy with fewer parameters and FLOPs. Optional FlashAttention integration further reduces memory‑access overhead and boosts inference speed on modern NVIDIA GPUs.

⚙️ Requirements

Python ≥ 3.8
PyTorch ≥ 1.10 (CUDA‑enabled)
CUDA ≥ 11.2 compatible GPU
Optional: FlashAttention (install via pip install flash-attn)
Recommended GPU architectures for FlashAttention support:
- Turing (e.g. T4, Quadro RTX)
- Ampere (RTX 30 series, A30/A40/A100)
- Ada Lovelace (RTX 40 series)
- Hopper (H100/H200)
System specs: ≥ 8 GB RAM, ≥ 50 GB free disk
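
The requirements above can be checked before installing anything heavy. The following is a minimal sketch, not part of this repository: the helper names (`parse_major_minor`, `flash_attention_arch`, `check_environment`) are hypothetical, and the mapping from CUDA compute capability to the architectures listed above is an assumption based on NVIDIA's published capability numbers. It does not check the CUDA toolkit version, RAM, or disk space.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 8)
MIN_TORCH = (1, 10)

# Assumed compute capabilities for the GPU families listed above.
FLASH_ATTN_ARCHS = {
    (7, 5): "Turing",
    (8, 0): "Ampere",
    (8, 6): "Ampere",
    (8, 7): "Ampere",
    (8, 9): "Ada Lovelace",
    (9, 0): "Hopper",
}


def parse_major_minor(version):
    """Turn a version string such as '2.1.0+cu118' into (2, 1)."""
    core = version.split("+")[0]
    major, minor = core.split(".")[:2]
    return int(major), int(minor)


def flash_attention_arch(capability):
    """Return the architecture name for a (major, minor) capability, or None."""
    return FLASH_ATTN_ARCHS.get(tuple(capability))


def check_environment():
    """Collect a small report on Python, PyTorch, and the first CUDA device."""
    report = {
        "python_ok": sys.version_info[:2] >= MIN_PYTHON,
        "torch_ok": False,
        "cuda_available": False,
        "gpu_arch": None,
    }
    # Only import torch if it is actually installed.
    if importlib.util.find_spec("torch") is not None:
        import torch

        report["torch_ok"] = parse_major_minor(torch.__version__) >= MIN_TORCH
        if torch.cuda.is_available():
            report["cuda_available"] = True
            capability = torch.cuda.get_device_capability(0)
            report["gpu_arch"] = flash_attention_arch(capability)
    return report


print(check_environment())
```

A `gpu_arch` of `None` only means the device is not one of the families listed above; the model should still run without FlashAttention.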

🚀 Installation & Usage

pip install ultralytics
# (Optional for FlashAttention)
pip install flash-attn

Python example:

from ultralytics import YOLO

# Load a COCO-pretrained YOLO12x model
model = YOLO("yolo12x.pt")

# Train the model on the COCO8 example dataset for 100 epochs
results = model.train(data="coco8.yaml", epochs=100, imgsz=640)

# Run inference with the YOLO12x model on the 'bus.jpg' image
results = model("path/to/bus.jpg")
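
The prediction call above returns a list with one Ultralytics `Results` object per image. As a rough sketch of how the detections might be read out, the helper below walks the `boxes.cls`, `boxes.conf`, and `boxes.xyxy` tensors and maps class ids through `names`. The function name `summarize_detections` and its `min_conf` parameter are hypothetical and chosen only for illustration.

```python
def summarize_detections(result, min_conf=0.25):
    """Return (class_name, confidence, [x1, y1, x2, y2]) for one Results object.

    `result` is expected to expose `names` (id -> label) and `boxes` with
    `cls`, `conf`, and `xyxy` tensors, as Ultralytics Results objects do.
    """
    boxes = result.boxes
    rows = []
    for cls_id, conf, xyxy in zip(
        boxes.cls.tolist(), boxes.conf.tolist(), boxes.xyxy.tolist()
    ):
        # Skip detections below the confidence threshold.
        if conf < min_conf:
            continue
        label = result.names[int(cls_id)]
        rows.append((label, round(conf, 3), [round(v, 1) for v in xyxy]))
    return rows
```

With the `results` from the example above, `summarize_detections(results[0])` would give one tuple per detected object in `bus.jpg`, with box corners in pixel coordinates.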

CLI example:

yolo detect predict model=yolov12x.pt source=test.jpg imgsz=640 conf=0.25

📊 Performance & Use Cases

Benchmarked on COCO val2017 at 640 × 640 resolution on an NVIDIA T4 GPU:

Model	mAP@0.5:0.95	Latency (ms)	Params (M)	FLOPs (B)
YOLO12‑x	55.2 %	11.79	59.1	199.0
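
To relate the latency figure to a frame‑rate target, per‑image latency can be converted to throughput. The sketch below uses hypothetical helper names and assumes the reported 11.79 ms is model‑only latency at batch size 1; pre‑ and post‑processing would add to it in a real pipeline.

```python
def fps_from_latency(latency_ms, batch_size=1):
    """Convert per-batch latency in milliseconds to images per second."""
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    return batch_size * 1000.0 / latency_ms


def fits_frame_budget(latency_ms, target_fps):
    """Check whether one inference fits in the time budget of one frame."""
    return latency_ms <= 1000.0 / target_fps


print(round(fps_from_latency(11.79), 1))  # 84.8
print(fits_frame_budget(11.79, 60))       # True: 60 FPS allows about 16.7 ms per frame
```

Under these assumptions the T4 figure corresponds to roughly 85 images per second, which leaves headroom for a 30 or 60 FPS video stream on that GPU.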

YOLOv12‑x excels in scenarios demanding both high accuracy and near‑real‑time throughput:

Autonomous vehicles
Industrial inspection
Surveillance & security systems

📚 References

@article{tian2025yolov12,
  title={YOLOv12: Attention-Centric Real-Time Object Detectors},
  author={Tian, Yunjie and Ye, Qixiang and Doermann, David},
  journal={arXiv preprint arXiv:2502.12524},
  year={2025}
}

📝 Summary

Feature	Details
Model	YOLOv12‑x
Architecture	Area Attention + R‑ELAN
FlashAttention	Optional (GPU‑accelerated)
Requirements	Python ≥ 3.8, PyTorch ≥ 1.10, CUDA ≥ 11.2
Use Cases	Real‑time object detection with high accuracy

Files:
├── yolov12x.pt          # Trained model weights
└── README.md            # This file
