---
language:
- en
base_model:
- Ultralytics/YOLO11
pipeline_tag: object-detection
tags:
- soccer
- football
- player
- ball
- referee
- detection
- analysis
- ultralytics
- pitch
datasets:
- Adit-jain/Soccana_player_ball_detection_v1
---

# ⚽ SoccerNet Object Detection Model (YOLOv11)

1. [Introduction](#introduction)
2. [Demo](#demo)
3. [Model Capabilities](#model-capabilities)
4. [Architecture & Technical Specifications](#architecture--technical-specifications)
5. [Implementation & Usage](#implementation--usage)

---

## Introduction

The **Soccer Object Detection Model** is a computer vision solution specifically designed for comprehensive soccer video analysis. Built upon the **YOLOv11n** architecture and trained on a meticulously curated multi-source dataset, this model provides real-time detection of players, balls, and referees in soccer videos across diverse conditions and environments.

This model serves as the foundation for the complete Soccer Analysis Pipeline, enabling advanced capabilities such as player tracking, team assignment, tactical analysis, and performance metrics extraction.

### Key Features
- **Multi-class Detection**: Simultaneous detection of players, balls, and referees
- **Real-time Performance**: Optimized for live video analysis (30+ FPS)
- **Scale-Invariant**: Effective detection across different camera distances and angles
- **Robust Performance**: Trained on diverse datasets with varying lighting, weather, and field conditions
- **Edge-Case Optimized**: Fine-tuned specifically for soccer scenarios and their edge cases


---

## Demo

Sample video: [Google Drive](https://drive.google.com/file/d/1XWEvUuWHv3peKNvTeZiTLyjNtrnYD_RZ/view?usp=sharing)

Note: This sample uses K-means, UMAP, and SigLIP for team assignment. Re-identification is not applied, which is why the player numbers are so large.
<p>
<img src="thumbnail.jpg" width="600"/>
</p>

---

## Model Capabilities

### Detection Classes
The model is trained to detect three primary object classes with high accuracy:

| Class ID | Object Type | Description | Use Case |
|----------|-------------|-------------|----------|
| **0** | **Player** | Soccer players from both teams including goalkeepers | Primary tracking target, team assignment, tactical analysis |
| **1** | **Ball** | Soccer ball in various states (rolling, airborne, stationary) | Possession tracking, game flow analysis, event detection |
| **2** | **Referee** | Match officials including referees and linesmen | Contextual differentiation, avoiding tracking confusion |
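
The class IDs in the table above can be used to split a mixed list of detections into per-class groups. The following is a minimal sketch in plain Python; the `Detection` tuple and the `split_by_class` helper are hypothetical names used for illustration and are not part of the repository.

```python
from collections import defaultdict
from typing import NamedTuple

# Class IDs as listed in the table above.
CLASS_NAMES = {0: "Player", 1: "Ball", 2: "Referee"}


class Detection(NamedTuple):
    """Hypothetical container: one box in [x1, y1, x2, y2] format."""
    xyxy: tuple
    confidence: float
    class_id: int


def split_by_class(detections):
    """Group detections by class name, ignoring unknown class IDs."""
    groups = defaultdict(list)
    for det in detections:
        name = CLASS_NAMES.get(det.class_id)
        if name is not None:
            groups[name].append(det)
    # Always return all three keys, even when a class has no detections.
    return {name: groups[name] for name in CLASS_NAMES.values()}


# Made-up detections for illustration only.
detections = [
    Detection((100, 200, 140, 300), 0.91, 0),
    Detection((400, 310, 412, 322), 0.55, 1),
    Detection((600, 180, 640, 290), 0.80, 2),
    Detection((150, 220, 190, 320), 0.88, 0),
]
by_class = split_by_class(detections)
print({name: len(items) for name, items in by_class.items()})
```

Keeping referees in their own group is what allows a downstream tracker to exclude them from team assignment.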

### Multi-Scale Detection
- **Close-up Shots**: High-precision detection in detailed player views
- **Medium Shots**: Balanced detection for tactical analysis
- **Wide-angle Views**: Full-field coverage with consistent detection quality
- **Aerial Views**: Drone and elevated camera perspectives

### Environmental Robustness
- **Lighting Conditions**: Day games, evening matches, indoor venues, stadium lighting
- **Weather Conditions**: Clear weather, rain, snow, fog
- **Field Surfaces**: Natural grass, artificial turf, different field conditions
- **Camera Angles**: Sideline, goal-line, elevated, broadcast standard angles

### Real-world Scenarios
- **Crowded Scenes**: Penalty area situations with multiple overlapping players
- **Occlusion Handling**: Partially visible players and objects
- **Motion Blur**: Fast-moving players and ball tracking
- **Scale Variation**: Players at different distances from camera

### Performance Characteristics
- **Detection Accuracy**: High precision with minimal false positives
- **Processing Speed**: Real-time capable (30+ FPS on modern GPUs)
- **Memory Efficiency**: Optimized for continuous video processing
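
Throughput depends heavily on hardware and input resolution, so the 30+ FPS figure is worth checking on the target machine. The sketch below shows one way to do that; `measure_fps` and `fake_detector` are hypothetical names, and the stand-in detector simply sleeps instead of running the model.

```python
import time

FRAME_BUDGET_MS = 1000 / 30  # about 33.3 ms per frame at 30 FPS


def measure_fps(process_frame, frames, warmup=2):
    """Time process_frame over frames and return frames per second."""
    frames = list(frames)
    # Warm-up calls are excluded from timing (first calls are often slower).
    for frame in frames[:warmup]:
        process_frame(frame)
    start = time.perf_counter()
    for frame in frames:
        process_frame(frame)
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed if elapsed > 0 else float("inf")


def fake_detector(frame):
    """Stand-in for a real model call; replace with actual inference."""
    time.sleep(0.001)


fps = measure_fps(fake_detector, range(20))
print(f"{fps:.1f} FPS (real-time at 30 FPS: {fps >= 30})")
```

For GPU inference, the timed call should include any host-to-device transfer and post-processing so that the number reflects end-to-end latency.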

---

## Architecture & Technical Specifications

### Base Architecture: YOLOv11n

**YOLOv11n** (You Only Look Once version 11, nano variant) serves as the foundation architecture, providing an optimal balance between accuracy and computational efficiency.

### Dataset
[Soccana_player_ball_detection_v1](https://huggingface.co/datasets/Adit-jain/Soccana_player_ball_detection_v1)

The dataset covers a range of edge cases, including:
- Occlusions
- Close-up shots
- Behind-the-goalpost scenes
- Camera overlay scenes
- Low- and high-angle shots
- Shots at various resolutions (160, 320, 540, 1280)

### Training Configuration

The model training follows an optimized configuration designed for soccer-specific detection tasks:

```python
# Core Training Parameters
epochs = 200                    # Extended training for convergence
img_size = 1280                 # High-resolution input (1280x1280)
batch_size = 32                 # Optimal batch size for 1280 resolution
workers = 8                     # Multi-threaded data loading

# Learning Rate Schedule
lr0 = 0.01                      # Initial learning rate
lrf = 0.01                      # Final LR fraction (final LR = lr0 * lrf = 0.0001)
momentum = 0.937                # SGD momentum
weight_decay = 0.0005           # L2 regularization

# Regularization & Augmentation
dropout = 0.3                   # Dropout rate for overfitting prevention
augmentation_probability = 0.5  # Data augmentation frequency
```
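
In Ultralytics-style training, `lrf` is a multiplier on `lr0` rather than an absolute rate, so the values above imply a final learning rate of `lr0 * lrf = 0.0001`. The sketch below assumes the linear decay schedule (understood to be the Ultralytics default when cosine scheduling is not enabled) and ignores warmup; `lr_at_epoch` is a hypothetical helper name.

```python
def lr_at_epoch(epoch, epochs=200, lr0=0.01, lrf=0.01):
    """Learning rate under an assumed linear decay from lr0 to lr0 * lrf."""
    factor = max(1 - epoch / epochs, 0) * (1.0 - lrf) + lrf
    return lr0 * factor


for epoch in (0, 50, 100, 150, 200):
    print(f"epoch {epoch:3d}: lr = {lr_at_epoch(epoch):.6f}")
```

Under these assumptions the rate starts at 0.01, passes 0.00505 at epoch 100, and ends at 0.0001 at epoch 200.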

### Advanced Training Settings

#### **Augmentation Strategy**
```python
# Photometric Augmentations
hsv_h = 0.015                   # Hue augmentation range
hsv_s = 0.7                     # Saturation augmentation range  
hsv_v = 0.4                     # Value augmentation range

# Geometric Augmentations
degrees = 0.0                   # Rotation range (disabled for sports)
translate = 0.1                 # Translation augmentation
scale = 0.5                     # Scale augmentation range
shear = 0.0                     # Shear transformation (disabled)

# Advanced Augmentations
mosaic = 1.0                    # Mosaic augmentation probability
mixup = 0.0                     # Mixup augmentation (disabled)
copy_paste = 0.0                # Copy-paste augmentation (disabled)
```
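
To make the ranges above concrete, the sketch below draws one random set of augmentation parameters. It assumes the convention common in YOLO-style pipelines, where HSV settings define multiplicative gains of `1 ± value`, `scale` defines a zoom factor of `1 ± value`, and `translate` is a fraction of the image size. The `sample_augmentation` helper is hypothetical and is not the training code itself.

```python
import random


def sample_augmentation(rng, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4,
                        translate=0.1, scale=0.5):
    """Draw one set of augmentation parameters (assumed convention)."""
    return {
        "hue_gain": 1 + rng.uniform(-1, 1) * hsv_h,         # 0.985 .. 1.015
        "sat_gain": 1 + rng.uniform(-1, 1) * hsv_s,         # 0.3 .. 1.7
        "val_gain": 1 + rng.uniform(-1, 1) * hsv_v,         # 0.6 .. 1.4
        "scale_factor": rng.uniform(1 - scale, 1 + scale),  # 0.5 .. 1.5
        "shift_fraction": rng.uniform(-translate, translate),
    }


rng = random.Random(0)  # fixed seed for reproducibility
params = sample_augmentation(rng)
for name, value in params.items():
    print(f"{name}: {value:.3f}")
```

The narrow hue range relative to saturation and value is a sensible choice for this task, since jersey color is the main cue for team assignment.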

#### **Loss Function Configuration**
```python
# Detection Loss Components
box_loss_gain = 0.05            # Bounding box loss weight
cls_loss_gain = 0.5             # Classification loss weight  
dfl_loss_gain = 1.5             # Distribution focal loss weight

# Focal Loss Parameters
fl_gamma = 0.0                  # Focal loss gamma (disabled)
label_smoothing = 0.0           # Label smoothing factor
```

#### **Optimizer Settings**
```python
optimizer = "SGD"               # Stochastic Gradient Descent
nbs = 64                        # Nominal batch size for scaling
warmup_epochs = 3.0             # Learning rate warmup period
warmup_momentum = 0.8           # Warmup momentum
warmup_bias_lr = 0.1            # Warmup bias learning rate
```
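
The nominal batch size interacts with the actual batch size through gradient accumulation. The sketch below works through the arithmetic for a batch size of 32 and `nbs = 64`, assuming the scaling rules used by Ultralytics-style trainers; the helper names are hypothetical, and the 100-iteration warmup floor is an assumption.

```python
def accumulation_steps(batch_size, nbs=64):
    """Number of batches accumulated before each optimizer step."""
    return max(round(nbs / batch_size), 1)


def scaled_weight_decay(weight_decay, batch_size, nbs=64):
    """Weight decay rescaled to the effective batch size."""
    accumulate = accumulation_steps(batch_size, nbs)
    return weight_decay * batch_size * accumulate / nbs


def warmup_iterations(warmup_epochs, batches_per_epoch, minimum=100):
    """Warmup length in iterations, with an assumed lower bound."""
    return max(round(warmup_epochs * batches_per_epoch), minimum)


print(accumulation_steps(32))           # gradients accumulated over 2 batches
print(scaled_weight_decay(0.0005, 32))  # effective decay stays at 0.0005
print(warmup_iterations(3.0, 500))      # 500 batches per epoch is an example value
```

With a batch size of 32, gradients from two batches are accumulated per optimizer step, so the effective batch size matches `nbs` and the weight decay is unchanged.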

### Model Architecture Parameters

#### **Backbone Configuration**
```python
depth_multiple = 0.33           # Model depth scaling factor (nano)
width_multiple = 0.25           # Model width scaling factor (nano)  
max_channels = 1024             # Maximum channel count
```
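
The multipliers above shrink a base architecture rather than describing layers directly. The sketch below shows how such factors are typically applied, assuming the usual YOLO convention of rounding channel counts up to a multiple of 8 and keeping at least one repeat per block; the helper names are hypothetical.

```python
import math


def make_divisible(x, divisor=8):
    """Round x up to the nearest multiple of divisor."""
    return math.ceil(x / divisor) * divisor


def scaled_channels(channels, width_multiple=0.25, max_channels=1024):
    """Channel count after capping at max_channels and applying the width factor."""
    return make_divisible(min(channels, max_channels) * width_multiple)


def scaled_repeats(repeats, depth_multiple=0.33):
    """Block repeat count after applying the depth factor (never below 1)."""
    return max(round(repeats * depth_multiple), 1) if repeats > 1 else repeats


for base in (64, 128, 256, 512, 1024, 2048):
    print(f"{base:5d} base channels -> {scaled_channels(base)}")
for base in (1, 3, 6, 9):
    print(f"{base} base repeats -> {scaled_repeats(base)}")
```

With these values a 1024-channel base layer becomes 256 channels, and a block repeated 6 times in the base definition is repeated twice.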

#### **Detection Head Settings**
```python
anchors = None                  # Anchor-free detection
nc = 3                          # Number of classes (Player, Ball, Referee)
conf_threshold = 0.25           # Confidence threshold for detection
iou_threshold = 0.45            # IoU threshold for NMS
max_det = 300                   # Maximum detections per image
```
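
The three inference thresholds above act in sequence: low-confidence boxes are dropped, overlapping boxes are suppressed by IoU, and the result is capped at `max_det`. The following is a minimal, class-agnostic sketch of greedy non-maximum suppression in plain Python. It illustrates the technique and is not the implementation used by the model; production pipelines usually run NMS per class and on the GPU.

```python
def iou(a, b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, conf_threshold=0.25, iou_threshold=0.45, max_det=300):
    """Return indices of kept boxes, highest score first."""
    candidates = [i for i, s in enumerate(scores) if s > conf_threshold]
    candidates.sort(key=lambda i: scores[i], reverse=True)
    keep = []
    for i in candidates:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
            if len(keep) == max_det:
                break
    return keep


# Made-up boxes: the second overlaps the first, the fourth has low confidence.
boxes = [(100, 100, 200, 200), (105, 105, 205, 205),
         (300, 300, 400, 400), (500, 500, 520, 520)]
scores = [0.9, 0.8, 0.7, 0.1]
print(nms(boxes, scores))
```

In this example the second box has an IoU of about 0.82 with the first and is suppressed, and the fourth falls below the confidence threshold, so indices 0 and 2 are kept.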

### Hardware Optimization

#### **GPU Configuration**
```python
device = "cuda"                 # GPU acceleration  
multi_gpu = True                # Multi-GPU training support
amp = True                      # Automatic Mixed Precision
half = False                    # FP16 inference (disabled during training)
```

#### **Memory Management**  
```python
cache = "ram"                   # Dataset caching strategy
save_memory = False             # Memory optimization mode
rect = False                    # Rectangular training (disabled)
```

---

## Implementation & Usage

### Model Integration Points

The Soccer Object Detection Model is used throughout the Soccer Analysis Pipeline:

#### **Core Detection Module** (`player_detection/`)

```python
from player_detection import load_detection_model, get_detections
import supervision as sv

# Load the trained model
model = load_detection_model("Models/Trained/yolov11_sahi_1280/Model/weights/best.pt")

# Perform detection on a frame (`frame` is a single video frame supplied by the caller)
player_detections, ball_detections, referee_detections = get_detections(model, frame)

# Results are returned as supervision.Detections objects with:
# - Bounding boxes in [x1, y1, x2, y2] format  
# - Confidence scores for each detection
# - Class IDs (0=Player, 1=Ball, 2=Referee)
```

#### **Pipeline Integration** (`pipelines/detection_pipeline.py`)

```python
from pipelines import DetectionPipeline

# Initialize detection pipeline (`model_path` is the path to the trained weights file)
pipeline = DetectionPipeline(model_path)

# Video-based detection
pipeline.detect_in_video("input.mp4", "output_detected.mp4", frame_count=300)

# Real-time detection  
pipeline.detect_realtime("input.mp4")  # or webcam index: 0

# Frame-level detection
player_det, ball_det, ref_det = pipeline.detect_frame_objects(frame)
annotated_frame = pipeline.annotate_detections(frame, player_det, ball_det, ref_det)
```

### A detailed guide and the full code can be found on **[GitHub](https://github.com/Adit-jain/Soccer_Analysis)**

---

*Quick Links*

**🔗 Repository**: [https://github.com/Adit-jain/Soccer_Analysis](https://github.com/Adit-jain/Soccer_Analysis)  
**📊 Dataset**: [https://huggingface.co/datasets/Adit-jain/Soccana_player_ball_detection_v1](https://huggingface.co/datasets/Adit-jain/Soccana_player_ball_detection_v1)  
**🤖 Model**: [https://huggingface.co/Adit-jain/soccana](https://huggingface.co/Adit-jain/soccana)