Ryan Pfister
committed on
Commit
·
f98dfcd
1
Parent(s):
4b98062
feat: Add extended 300-epoch model and update stats
- README.md +69 -46
- yolo12l-person-seg-extended.pt +3 -0
README.md
CHANGED
@@ -1,46 +1,59 @@
|
|
1 |
---
|
2 |
license: agpl-3.0
|
3 |
tags:
|
4 |
-
|
5 |
-
|
6 |
-
|
7 |
-
|
8 |
-
|
9 |
-
|
10 |
-
|
11 |
-
|
12 |
-
|
13 |
datasets:
|
14 |
-
|
15 |
---
|
16 |
|
17 |
# YOLO12-seg Person Segmentation Model
|
18 |
|
19 |
-
A YOLO12-large (YOLO12l) instance segmentation model trained specifically for detecting and
|
|
|
20 |
|
21 |
## Model Description
|
22 |
|
23 |
-
This model is a fine-tuned YOLO12-seg model optimized exclusively for person segmentation. It uses
|
|
|
|
|
24 |
|
25 |
### Key Features
|
26 |
|
27 |
-
-
|
28 |
-
-
|
29 |
-
-
|
30 |
-
-
|
31 |
-
-
|
32 |
|
33 |
## Training
|
34 |
|
35 |
The model was trained on a filtered version of the COCO dataset containing only images with people:
|
36 |
|
37 |
-
-
|
38 |
-
-
|
39 |
-
-
|
40 |
-
|
41 |
-
|
42 |
-
|
43 |
-
|
|
|
44 |
|
45 |
## Performance
|
46 |
|
@@ -48,14 +61,17 @@ The model achieves the following metrics on the COCO person validation set:
|
|
48 |
|
49 |
| Metric | Value |
|
50 |
| ------------------- | ----- |
|
51 |
-
| Box mAP50-95 (COCO) | 0.
|
52 |
-
| Box mAP50 (COCO) | 0.
|
53 |
-
| Mask mAP50-95 | 0.
|
54 |
-
| Mask mAP50 | 0.
|
55 |
-
| Box Precision | 0.
|
56 |
-
| Box Recall | 0.
|
57 |
| Mask Precision | 0.843 |
|
58 |
-
| Mask Recall | 0.
|
59 |
|
60 |
These metrics were computed on the standard COCO `val2017` validation set.
|
61 |
|
@@ -76,17 +92,20 @@ These metrics were computed on the standard COCO `val2017` validation set.
|
|
76 |
<img src="examples/example5.png" alt="Person segmentation example 5" style="max-width:90%;" />
|
77 |
</div>
|
78 |
|
79 |
-
The model effectively segments people in various poses, lighting conditions, and contexts, providing
|
80 |
|
81 |
## Use Cases
|
82 |
|
83 |
This model is ideal for applications requiring precise person segmentation:
|
84 |
|
85 |
-
-
|
86 |
-
-
|
87 |
-
-
|
88 |
-
-
|
89 |
-
-
|
90 |
|
91 |
## Usage
|
92 |
|
@@ -96,7 +115,7 @@ The model can be used directly with Ultralytics YOLO:
|
|
96 |
from ultralytics import YOLO
|
97 |
|
98 |
# Load the model
|
99 |
-
model = YOLO('path/to/yolo12l-person-seg.pt')
|
100 |
|
101 |
# Perform inference
|
102 |
results = model('image.jpg')
|
@@ -121,7 +140,7 @@ import numpy as np
|
|
121 |
from ultralytics import YOLO
|
122 |
|
123 |
# Load the model and image
|
124 |
-
model = YOLO('path/to/yolo12l-person-seg.pt')
|
125 |
image = cv2.imread('image.jpg')
|
126 |
|
127 |
# Perform inference
|
@@ -144,12 +163,14 @@ if result.masks is not None:
|
|
144 |
|
145 |
## Limitations
|
146 |
|
147 |
-
-
|
148 |
-
-
|
149 |
-
-
|
150 |
-
-
|
151 |
-
-
|
152 |
-
|
|
|
|
|
153 |
|
154 |
## License
|
155 |
|
@@ -157,4 +178,6 @@ This model is available under the GNU Affero General Public License v3.0 (AGPL-3
|
|
157 |
|
158 |
### License Note
|
159 |
|
160 |
-
This model was trained using the Ultralytics YOLO framework, which is licensed under the GNU Affero
|
|
|
|
|
|
1 |
---
|
2 |
license: agpl-3.0
|
3 |
tags:
|
4 |
+
- yolo
|
5 |
+
- yolo12
|
6 |
+
- segmentation
|
7 |
+
- object-detection
|
8 |
+
- person-detection
|
9 |
+
- instance-segmentation
|
10 |
+
- pytorch
|
11 |
+
- ultralytics
|
12 |
+
- computer-vision
|
13 |
datasets:
|
14 |
+
- coco
|
15 |
---
|
16 |
|
17 |
# YOLO12-seg Person Segmentation Model
|
18 |
|
19 |
+
A YOLO12-large (YOLO12l) instance segmentation model trained specifically for detecting and
|
20 |
+
segmenting people with high precision.
|
21 |
|
22 |
## Model Description
|
23 |
|
24 |
+
This model is a fine-tuned YOLO12-seg model optimized exclusively for person segmentation. It uses
|
25 |
+
the large (L) scale configuration of YOLO12, featuring 28.76M parameters and 510 layers with a depth
|
26 |
+
and width multiplier of 1.0.
|
27 |
|
28 |
### Key Features
|
29 |
|
30 |
+
- **Single-Class Focus**: Specialized in detecting only people
|
31 |
+
- **Detailed Segmentation**: Provides a pixel-level segmentation mask for each detected person
|
32 |
+
- **High Throughput**: Optimized for processing hundreds of images per minute
|
33 |
+
- **Quality-Optimized**: Trained specifically for accurate boundary delineation
|
34 |
+
- **GPU-Optimized**: The Large (L) model is designed for GPU deployment, not edge devices or
|
35 |
+
mobile phones
|
36 |
+
|
37 |
+
### Available Models
|
38 |
+
|
39 |
+
This repository contains two model versions:
|
40 |
+
|
41 |
+
- `yolo12l-person-seg.pt`: The original model trained for 100 epochs.
|
42 |
+
- `yolo12l-person-seg-extended.pt`: The improved model after extended training for 300 epochs
|
43 |
+
(recommended).
|
44 |
|
45 |
## Training
|
46 |
|
47 |
The model was trained on a filtered version of the COCO dataset containing only images with people:
|
48 |
|
49 |
+
- **Training Images**: 64,114 images containing people
|
50 |
+
- **Validation Images**: 2,693 images containing people
|
51 |
+
- **Training Details**:
|
52 |
+
- Initially trained for 100 epochs; training was then extended to a total of 300
|
53 |
+
epochs.
|
54 |
+
- Input resolution: 640×640
|
55 |
+
- Class-focused optimization with `single_cls=True` and `classes=0`
|
56 |
+
- Optimized for segmentation with `overlap_mask=True` and `mask_ratio=4`
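The settings listed above map onto keyword arguments of the Ultralytics `train()` call. The following is a minimal sketch of how they could be assembled, not the author's actual training script: the `weights` and `data_yaml` arguments are hypothetical placeholders (the README does not name the starting checkpoint or the dataset file), and every option not mentioned in the README is left at the library default.

```python
# Training options stated in the README. Anything not listed here is left
# at the Ultralytics default, which may differ from the original run.
README_TRAIN_ARGS = {
    "epochs": 300,         # total epochs after the extended run
    "imgsz": 640,          # 640x640 input resolution
    "single_cls": True,    # treat the dataset as a single class
    "classes": 0,          # keep only class index 0 (person in COCO)
    "overlap_mask": True,
    "mask_ratio": 4,
}


def build_train_args(data_yaml, epochs=300):
    """Return a copy of the README settings plus a dataset path.

    `data_yaml` is a hypothetical path to a dataset YAML describing the
    person-only COCO subset; the README does not give its name.
    """
    args = dict(README_TRAIN_ARGS)
    args["data"] = data_yaml
    args["epochs"] = epochs
    return args


def train(weights, data_yaml, epochs=300):
    """Launch training. `weights` is an assumption: any YOLO12-seg checkpoint
    or model YAML the caller chooses as the starting point."""
    from ultralytics import YOLO  # imported lazily so the helpers above work without it

    model = YOLO(weights)
    return model.train(**build_train_args(data_yaml, epochs))
```

Calling `build_train_args("person-seg.yaml", epochs=100)` would give the arguments for a shorter run like the original 100-epoch model, with the file name again being a placeholder.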
|
57 |
|
58 |
## Performance
|
59 |
|
|
|
61 |
|
62 |
| Metric | Value |
|
63 |
| ------------------- | ----- |
|
64 |
+
| Box mAP50-95 (COCO) | 0.642 |
|
65 |
+
| Box mAP50 (COCO) | 0.851 |
|
66 |
+
| Mask mAP50-95 | 0.537 |
|
67 |
+
| Mask mAP50 | 0.837 |
|
68 |
+
| Box Precision | 0.840 |
|
69 |
+
| Box Recall | 0.759 |
|
70 |
| Mask Precision | 0.843 |
|
71 |
+
| Mask Recall | 0.748 |
|
72 |
+
|
73 |
+
Note: These metrics reflect the performance of the extended 300-epoch model
|
74 |
+
(`yolo12l-person-seg-extended.pt`).
|
75 |
|
76 |
These metrics were computed on the standard COCO `val2017` validation set.
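To check the reported numbers against a local run, one option is to call the Ultralytics `val()` method and compare its output with the table. This is a sketch under assumptions: `weights` and `data_yaml` are placeholders for your local checkpoint and a person-only dataset YAML, the tolerance of 0.01 is an arbitrary choice, and results may differ slightly across library versions and hardware.

```python
# Values copied from the metrics table for the extended 300-epoch model.
REPORTED = {
    "box_map50_95": 0.642,
    "box_map50": 0.851,
    "mask_map50_95": 0.537,
    "mask_map50": 0.837,
    "box_precision": 0.840,
    "box_recall": 0.759,
    "mask_precision": 0.843,
    "mask_recall": 0.748,
}


def collect_metrics(metrics):
    """Flatten an Ultralytics segmentation metrics object into a plain dict."""
    return {
        "box_map50_95": float(metrics.box.map),
        "box_map50": float(metrics.box.map50),
        "mask_map50_95": float(metrics.seg.map),
        "mask_map50": float(metrics.seg.map50),
        "box_precision": float(metrics.box.mp),
        "box_recall": float(metrics.box.mr),
        "mask_precision": float(metrics.seg.mp),
        "mask_recall": float(metrics.seg.mr),
    }


def compare(measured, reported=REPORTED, tol=0.01):
    """Return, per metric, whether the measured value is within `tol` of the table."""
    return {name: abs(measured[name] - value) <= tol for name, value in reported.items()}


def validate(weights, data_yaml):
    """Run validation at 640 px. Both arguments are caller-supplied placeholders."""
    from ultralytics import YOLO  # imported lazily so the helpers above work without it

    metrics = YOLO(weights).val(data=data_yaml, imgsz=640)
    return collect_metrics(metrics)
```

A typical call would be `compare(validate('path/to/yolo12l-person-seg-extended.pt', 'person-seg.yaml'))`, where the YAML name is hypothetical.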
|
77 |
|
|
|
92 |
<img src="examples/example5.png" alt="Person segmentation example 5" style="max-width:90%;" />
|
93 |
</div>
|
94 |
|
95 |
+
The model effectively segments people in various poses, lighting conditions, and contexts, providing
|
96 |
+
accurate masks even with complex backgrounds. As shown in these examples, the segmentation masks
|
97 |
+
(highlighted in color) precisely outline the human subjects, making this model ideal for
|
98 |
+
applications requiring detailed person isolation.
|
99 |
|
100 |
## Use Cases
|
101 |
|
102 |
This model is ideal for applications requiring precise person segmentation:
|
103 |
|
104 |
+
- Human-centric image editing
|
105 |
+
- Background removal focused on people
|
106 |
+
- Virtual try-on applications
|
107 |
+
- People counting and crowd analysis
|
108 |
+
- Smart surveillance systems
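As one illustration of the use cases above, people counting reduces to counting the detections whose confidence clears a threshold. The sketch below assumes the standard Ultralytics result object, where `result.boxes.conf` holds one confidence score per detection; the helper names and the 0.5 threshold are illustrative choices, not values from this model card.

```python
def count_people(result, min_conf=0.5):
    """Count detections in one result whose confidence is at least `min_conf`.

    Because the model has a single class, every box is a person.
    """
    boxes = getattr(result, "boxes", None)
    if boxes is None:
        return 0
    return sum(1 for conf in boxes.conf if float(conf) >= min_conf)


def count_in_image(weights, image_path, min_conf=0.5):
    """Run inference on one image and count people. Paths are caller-supplied."""
    from ultralytics import YOLO  # imported lazily so count_people works without it

    results = YOLO(weights)(image_path)
    return count_people(results[0], min_conf)
```

For crowd scenes, the limitations noted elsewhere in this README (occlusion, small or distant people) mean the count is best treated as a lower-bound estimate.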
|
109 |
|
110 |
## Usage
|
111 |
|
|
|
115 |
from ultralytics import YOLO
|
116 |
|
117 |
# Load the model
|
118 |
+
model = YOLO('path/to/yolo12l-person-seg-extended.pt') # Or yolo12l-person-seg.pt for the original
|
119 |
|
120 |
# Perform inference
|
121 |
results = model('image.jpg')
|
|
|
140 |
from ultralytics import YOLO
|
141 |
|
142 |
# Load the model and image
|
143 |
+
model = YOLO('path/to/yolo12l-person-seg-extended.pt') # Or yolo12l-person-seg.pt for the original
|
144 |
image = cv2.imread('image.jpg')
|
145 |
|
146 |
# Perform inference
|
|
|
163 |
|
164 |
## Limitations
|
165 |
|
166 |
+
- This model is optimized for person segmentation only and won't detect other classes
|
167 |
+
- Performance may be reduced in extreme lighting conditions
|
168 |
+
- Occluded persons may have incomplete segmentation masks
|
169 |
+
- Small or distant people might not be detected as reliably as those in the foreground
|
170 |
+
- **GPU Recommended**: As a Large (L) model, real-time inference performance benefits from a
|
171 |
+
dedicated GPU
|
172 |
+
- **Edge Device Limitations**: Not optimized for mobile or edge deployment (consider YOLO12n or
|
173 |
+
YOLO12s for those use cases)
|
174 |
|
175 |
## License
|
176 |
|
|
|
178 |
|
179 |
### License Note
|
180 |
|
181 |
+
This model was trained using the Ultralytics YOLO framework, which is licensed under the GNU Affero
|
182 |
+
General Public License v3.0 (AGPL-3.0). As per the terms of the AGPL-3.0 license, any derivative
|
183 |
+
works (including trained models) must also be distributed under the same license.
|
yolo12l-person-seg-extended.pt
ADDED
@@ -0,0 +1,3 @@
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:7aeafb431135d431a9754f1c6c96303c41fa6b83c50587c330ec93700c72f9b0
|
3 |
+
size 58189122
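The three lines above are a Git LFS pointer: they record the SHA-256 digest and byte size of the real weights file rather than the weights themselves. A minimal sketch of how a downloaded copy could be checked against the pointer, using only the standard library (the function names are illustrative, not part of any Git LFS tooling):

```python
import hashlib
import os

# Pointer text as it appears in this commit.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:7aeafb431135d431a9754f1c6c96303c41fa6b83c50587c330ec93700c72f9b0
size 58189122
"""


def parse_lfs_pointer(text):
    """Split a pointer file into its version, hash algorithm, digest, and size."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer):
    """Return True if the file at `path` has the size and digest in `pointer`."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large weight files are not loaded at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]
```

For example, `matches_pointer('yolo12l-person-seg-extended.pt', parse_lfs_pointer(POINTER))` should return `True` for an intact download of the file added in this commit.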
|