---
language:
- en
base_model:
- Ultralytics/YOLO11
pipeline_tag: object-detection
tags:
- soccer
- football
- player
- ball
- referee
- detection
- analysis
- ultralytics
- pitch
datasets:
- Adit-jain/Soccana_player_ball_detection_v1
---

# ⚽ SoccerNet Object Detection Model (YOLOv11)

1. [Introduction](#introduction)
2. [Demo](#demo)
3. [Model Capabilities](#model-capabilities)
4. [Architecture & Technical Specifications](#architecture--technical-specifications)
5. [Implementation & Usage](#implementation--usage)

---

## Introduction

The **Soccer Object Detection Model** is a computer vision model for soccer video analysis. Built on the **YOLOv11n** architecture and trained on a curated multi-source dataset, it detects players, balls, and referees in soccer footage in real time across varied conditions and environments.

The model is the foundation of the complete Soccer Analysis Pipeline, where its detections feed player tracking, team assignment, tactical analysis, and performance metrics extraction.

### Key Features

- **Multi-class Detection**: Simultaneous detection of players, balls, and referees
- **Real-time Performance**: Optimized for live video analysis (30+ FPS on modern GPUs)
- **Scale-Invariant**: Effective detection across different camera distances and angles
- **Robust Performance**: Trained on diverse data with varying lighting, weather, and field conditions
- **Edge-Case Optimized**: Fine-tuned for soccer-specific scenarios and edge cases

---

## Demo

Sample video: [Google Drive](https://drive.google.com/file/d/1XWEvUuWHv3peKNvTeZiTLyjNtrnYD_RZ/view?usp=sharing)

Note: This sample uses K-means, UMAP, and SigLIP for team assignment. Re-identification is not applied, which is why the player ID numbers run high.

<p>
<img src="thumbnail.jpg" width="600"/>
</p>

---

## Model Capabilities

### Detection Classes

The model is trained to detect three primary object classes:

| Class ID | Object Type | Description | Use Case |
|----------|-------------|-------------|----------|
| **0** | **Player** | Soccer players from both teams, including goalkeepers | Primary tracking target, team assignment, tactical analysis |
| **1** | **Ball** | Soccer ball in various states (rolling, airborne, stationary) | Possession tracking, game flow analysis, event detection |
| **2** | **Referee** | Match officials, including referees and linesmen | Contextual differentiation, avoiding tracking confusion |

### Multi-Scale Detection

- **Close-up Shots**: High-precision detection in detailed player views
- **Medium Shots**: Balanced detection for tactical analysis
- **Wide-angle Views**: Full-field coverage with consistent detection quality
- **Aerial Views**: Drone and elevated camera perspectives

### Environmental Robustness

- **Lighting Conditions**: Day games, evening matches, indoor venues, stadium lighting
- **Weather Conditions**: Clear weather, rain, snow, fog
- **Field Surfaces**: Natural grass, artificial turf, varying field conditions
- **Camera Angles**: Sideline, goal-line, elevated, and standard broadcast angles

### Real-world Scenarios

- **Crowded Scenes**: Penalty area situations with multiple overlapping players
- **Occlusion Handling**: Partially visible players and objects
- **Motion Blur**: Fast-moving players and ball
- **Scale Variation**: Players at different distances from the camera

### Performance Characteristics

- **Detection Accuracy**: High precision with minimal false positives
- **Processing Speed**: Real-time capable (30+ FPS on modern GPUs)
- **Memory Efficiency**: Optimized for continuous video processing

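Throughput depends on hardware, input resolution, and batch size, so the FPS figure is best confirmed on the target machine. The following is a minimal timing sketch, not part of the repository: `measure_fps` and the stand-in `fake_detector` are hypothetical names, and in practice the callable would wrap the model's inference call. For GPU inference, make sure the call only returns once results are available, otherwise the measurement overstates the speed.

```python
import time


def measure_fps(infer, frames, warmup=2):
    """Return frames per second for calling `infer` on each frame."""
    # Warm-up calls are excluded so one-off setup cost does not skew the result.
    for frame in frames[:warmup]:
        infer(frame)
    start = time.perf_counter()
    for frame in frames:
        infer(frame)
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed if elapsed > 0 else float("inf")


def fake_detector(frame):
    """Stand-in for real inference: sleeps about 5 ms per frame."""
    time.sleep(0.005)
    return []


fps = measure_fps(fake_detector, frames=list(range(20)))
print(f"{fps:.1f} FPS")
```
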
---

## Architecture & Technical Specifications

### Base Architecture: YOLOv11n

**YOLOv11n** (You Only Look Once version 11, nano variant) is the base architecture. As the smallest variant in the family, it trades some accuracy for low computational cost, which suits real-time video processing.

### Dataset

[Soccana_player_ball_detection_v1](https://huggingface.co/datasets/Adit-jain/Soccana_player_ball_detection_v1)

The dataset covers a range of edge cases, such as:

- Occlusions
- Close-up shots
- Behind-the-goalpost scenes
- Camera overlay scenes
- Low- and high-angle shots
- Shots at multiple resolutions (160, 320, 540, 1280)

### Training Configuration

Training used the following configuration, chosen for soccer-specific detection:

```python
# Core Training Parameters
epochs = 200                     # Extended training for convergence
img_size = 1280                  # High-resolution input (1280x1280)
batch_size = 32                  # Batch size used at 1280 resolution
workers = 8                      # Data-loading worker processes

# Learning Rate Schedule
lr0 = 0.01                       # Initial learning rate
lrf = 0.01                       # Final learning rate as a fraction of lr0 (final LR = lr0 * lrf)
momentum = 0.937                 # SGD momentum
weight_decay = 0.0005            # L2 regularization

# Regularization & Augmentation
dropout = 0.3                    # Dropout rate for overfitting prevention
augmentation_probability = 0.5   # Data augmentation frequency
```

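As a rough guide to reproducing this setup, the sketch below collects the values above under the argument names used by the Ultralytics `train` API (`imgsz` and `batch` rather than `img_size` and `batch_size`). It is an illustration, not the author's training script: the dataset YAML path and the helper names are hypothetical, and `augmentation_probability` is left out because it is not clear which Ultralytics argument it corresponds to.

```python
def build_train_kwargs(data_yaml):
    """Collect the settings listed above under Ultralytics' argument names."""
    return {
        "data": data_yaml,        # hypothetical dataset YAML path
        "epochs": 200,
        "imgsz": 1280,            # `img_size` above
        "batch": 32,              # `batch_size` above
        "workers": 8,
        "lr0": 0.01,
        "lrf": 0.01,
        "momentum": 0.937,
        "weight_decay": 0.0005,
        "dropout": 0.3,
    }


def train(data_yaml, weights="yolo11n.pt"):
    """Start training; requires `pip install ultralytics`."""
    # Imported lazily so the helper above stays dependency-free.
    from ultralytics import YOLO

    model = YOLO(weights)
    return model.train(**build_train_kwargs(data_yaml))


kwargs = build_train_kwargs("soccana.yaml")
print(kwargs["imgsz"], kwargs["batch"])
```
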
### Advanced Training Settings

#### **Augmentation Strategy**

```python
# Photometric Augmentations
hsv_h = 0.015      # Hue augmentation range
hsv_s = 0.7        # Saturation augmentation range
hsv_v = 0.4        # Value augmentation range

# Geometric Augmentations
degrees = 0.0      # Rotation range (disabled for sports)
translate = 0.1    # Translation augmentation
scale = 0.5        # Scale augmentation range
shear = 0.0        # Shear transformation (disabled)

# Advanced Augmentations
mosaic = 1.0       # Mosaic augmentation probability
mixup = 0.0        # Mixup augmentation (disabled)
copy_paste = 0.0   # Copy-paste augmentation (disabled)
```

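To make the photometric ranges concrete, here is a small sketch of one common way such HSV gains are applied: each channel gets a random multiplier drawn from `[1 - g, 1 + g]`. This is an assumption about the general technique, not a copy of the training code, and both function names are hypothetical.

```python
import random


def sample_hsv_gains(hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, rng=random):
    """Draw one multiplicative gain per channel, each in [1 - g, 1 + g]."""
    return tuple(1.0 + rng.uniform(-1.0, 1.0) * g for g in (hsv_h, hsv_s, hsv_v))


def jitter_pixel(h, s, v, gains):
    """Apply the gains to one HSV pixel whose components lie in [0, 1]."""
    gh, gs, gv = gains
    return (
        (h * gh) % 1.0,              # hue wraps around the colour wheel
        min(max(s * gs, 0.0), 1.0),  # saturation is clipped to [0, 1]
        min(max(v * gv, 0.0), 1.0),  # value is clipped to [0, 1]
    )


rng = random.Random(0)  # seeded for a repeatable example
gains = sample_hsv_gains(rng=rng)
pixel = jitter_pixel(0.33, 0.60, 0.70, gains)
print(gains, pixel)
```

With these ranges, hue barely moves (at most 1.5%), while saturation and brightness vary widely, which fits footage where kit colours matter but lighting changes a lot.
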
#### **Loss Function Configuration**

```python
# Detection Loss Components
box_loss_gain = 0.05     # Bounding box loss weight
cls_loss_gain = 0.5      # Classification loss weight
dfl_loss_gain = 1.5      # Distribution focal loss weight

# Focal Loss Parameters
fl_gamma = 0.0           # Focal loss gamma (disabled)
label_smoothing = 0.0    # Label smoothing factor
```

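The three gains can be read as weights in a weighted sum of the loss components. The sketch below assumes they act as plain multipliers, as is typical for YOLO-style detection losses; `total_loss` is a hypothetical helper and the component values are made up for illustration.

```python
def total_loss(box_loss, cls_loss, dfl_loss,
               box_gain=0.05, cls_gain=0.5, dfl_gain=1.5):
    """Weighted sum of the box, classification, and DFL loss components."""
    return box_gain * box_loss + cls_gain * cls_loss + dfl_gain * dfl_loss


# Illustrative component values, not measured results.
loss = total_loss(box_loss=1.2, cls_loss=0.8, dfl_loss=1.0)
print(round(loss, 4))
```
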
#### **Optimizer Settings**

```python
optimizer = "SGD"        # Stochastic Gradient Descent
nbs = 64                 # Nominal batch size for scaling
warmup_epochs = 3.0      # Learning rate warmup period
warmup_momentum = 0.8    # Warmup momentum
warmup_bias_lr = 0.1     # Warmup bias learning rate
```

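The sketch below shows how these settings might combine with `lr0` and `lrf` into a learning-rate curve. It assumes a linear decay from `lr0` to `lr0 * lrf` with a linear warmup on top, and gradient accumulation toward the nominal batch size; the actual trainer may differ in detail (for example by warming up per iteration), and both function names are hypothetical.

```python
def lr_at(epoch, epochs=200, lr0=0.01, lrf=0.01,
          warmup_epochs=3.0, warmup_start=0.0):
    """Learning rate at a (possibly fractional) epoch.

    Use warmup_start=0.0 for weights and warmup_start=0.1 (warmup_bias_lr)
    for biases, which ramp down toward the schedule instead of up.
    """
    # Linear decay from lr0 at epoch 0 to lr0 * lrf at the final epoch.
    decay = (1.0 - epoch / epochs) * (1.0 - lrf) + lrf
    target = lr0 * decay
    if epoch < warmup_epochs:
        # Linear ramp from warmup_start to the scheduled value.
        frac = epoch / warmup_epochs
        return warmup_start + frac * (target - warmup_start)
    return target


def accumulation_steps(batch_size, nbs=64):
    """Batches to accumulate so the effective batch approaches `nbs`."""
    return max(round(nbs / batch_size), 1)


for e in (0, 1.5, 3, 100, 200):
    print(e, f"{lr_at(e):.6f}")
print("accumulate:", accumulation_steps(32))
```

Under these assumptions the rate starts at zero, peaks just below `lr0` at the end of warmup, and finishes at `lr0 * lrf`; a batch size of 32 with `nbs = 64` accumulates gradients over two batches.
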
### Model Architecture Parameters

#### **Backbone Configuration**

```python
depth_multiple = 0.33    # Model depth scaling factor (nano)
width_multiple = 0.25    # Model width scaling factor (nano)
max_channels = 1024      # Maximum channel count
```

#### **Detection Head Settings**

```python
anchors = None           # Anchor-free detection
nc = 3                   # Number of classes (Player, Ball, Referee)
conf_threshold = 0.25    # Confidence threshold for detection
iou_threshold = 0.45     # IoU threshold for NMS
max_det = 300            # Maximum detections per image
```

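To show how `conf_threshold`, `iou_threshold`, and `max_det` interact, here is a plain-Python sketch of greedy non-maximum suppression. It is class-agnostic for brevity, whereas detection frameworks usually suppress per class, and it is meant as an explanation of the technique rather than the code the model runs.

```python
def iou(a, b):
    """Intersection over union of two boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, conf_threshold=0.25, iou_threshold=0.45, max_det=300):
    """Greedy NMS: return indices of kept boxes, highest score first."""
    # Drop low-confidence boxes, then visit the rest in descending score order.
    order = sorted(
        (i for i, s in enumerate(scores) if s >= conf_threshold),
        key=lambda i: scores[i],
        reverse=True,
    )
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
            if len(keep) == max_det:
                break
    return keep


boxes = [(100, 100, 200, 300), (105, 110, 205, 310), (400, 120, 480, 320), (10, 10, 30, 30)]
scores = [0.90, 0.75, 0.60, 0.10]
print(nms(boxes, scores))  # [0, 2]
```

In this example the second box overlaps the first with an IoU of about 0.82 and is suppressed, and the last box falls below the confidence threshold.
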
### Hardware Optimization

#### **GPU Configuration**

```python
device = "cuda"        # GPU acceleration
multi_gpu = True       # Multi-GPU training support
amp = True             # Automatic Mixed Precision
half = False           # FP16 inference (disabled during training)
```

#### **Memory Management**

```python
cache = "ram"          # Dataset caching strategy
save_memory = False    # Memory optimization mode
rect = False           # Rectangular training (disabled)
```

---

## Implementation & Usage

### Model Integration Points

The Soccer Object Detection Model is used throughout the Soccer Analysis Pipeline at the following points:

#### **Core Detection Module** (`player_detection/`)

```python
import cv2
import supervision as sv
from player_detection import load_detection_model, get_detections

# Load the trained model
model = load_detection_model("Models/Trained/yolov11_sahi_1280/Model/weights/best.pt")

# Read a frame to run detection on (placeholder path; assumes an image array as read by OpenCV)
frame = cv2.imread("frame.jpg")

# Perform detection on a frame
player_detections, ball_detections, referee_detections = get_detections(model, frame)

# Results are returned as supervision.Detections objects with:
# - Bounding boxes in [x1, y1, x2, y2] format
# - Confidence scores for each detection
# - Class IDs (0=Player, 1=Ball, 2=Referee)
```

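For readers who want to try the weights without the repository's helper module, the sketch below does a comparable per-class split using the public Ultralytics and `supervision` APIs directly. It is not the implementation of `get_detections`: `load_model`, `detect_by_class`, and the placeholder paths are hypothetical, and the thresholds simply mirror the detection head settings above.

```python
CLASS_IDS = {"player": 0, "ball": 1, "referee": 2}


def load_model(weights_path):
    """Load YOLO weights with Ultralytics (requires `pip install ultralytics`)."""
    # Imported lazily so this module can be imported without the dependency.
    from ultralytics import YOLO

    return YOLO(weights_path)


def detect_by_class(model, frame, conf=0.25, iou=0.45):
    """Run one frame through the model and split detections by class."""
    import supervision as sv

    result = model(frame, conf=conf, iou=iou, verbose=False)[0]
    detections = sv.Detections.from_ultralytics(result)
    # Boolean-mask indexing returns a new sv.Detections per class.
    return {
        name: detections[detections.class_id == class_id]
        for name, class_id in CLASS_IDS.items()
    }


# Example usage (paths are placeholders):
# import cv2
# model = load_model("best.pt")
# groups = detect_by_class(model, cv2.imread("frame.jpg"))
# print({name: len(dets) for name, dets in groups.items()})
```

Loading the model once and reusing it across frames avoids paying the weight-loading cost on every call.
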
#### **Pipeline Integration** (`pipelines/detection_pipeline.py`)

```python
from pipelines import DetectionPipeline

# Initialize detection pipeline
model_path = "Models/Trained/yolov11_sahi_1280/Model/weights/best.pt"
pipeline = DetectionPipeline(model_path)

# Video-based detection
pipeline.detect_in_video("input.mp4", "output_detected.mp4", frame_count=300)

# Real-time detection
pipeline.detect_realtime("input.mp4")  # or webcam index: 0

# Frame-level detection (`frame` is a single video frame as an image array)
player_det, ball_det, ref_det = pipeline.detect_frame_objects(frame)
annotated_frame = pipeline.annotate_detections(frame, player_det, ball_det, ref_det)
```

### A detailed guide and the full code are available on **[GitHub](https://github.com/Adit-jain/Soccer_Analysis)**

---

*Quick Links*

**Repository**: [https://github.com/Adit-jain/Soccer_Analysis](https://github.com/Adit-jain/Soccer_Analysis)

**Dataset**: [https://huggingface.co/datasets/Adit-jain/Soccana_player_ball_detection_v1](https://huggingface.co/datasets/Adit-jain/Soccana_player_ball_detection_v1)

**🤗 Model**: [https://huggingface.co/Adit-jain/soccana](https://huggingface.co/Adit-jain/soccana)