Ryan Pfister committed on
Commit f98dfcd · 1 Parent(s): 4b98062

feat: Add extended 300-epoch model and update stats

Files changed (2)
  1. README.md +69 -46
  2. yolo12l-person-seg-extended.pt +3 -0
README.md CHANGED
@@ -1,46 +1,59 @@
1
  ---
2
  license: agpl-3.0
3
  tags:
4
- - yolo
5
- - yolo12
6
- - segmentation
7
- - object-detection
8
- - person-detection
9
- - instance-segmentation
10
- - pytorch
11
- - ultralytics
12
- - computer-vision
13
  datasets:
14
- - coco
15
  ---
16
 
17
  # YOLO12-seg Person Segmentation Model
18
 
19
- A YOLO12-large (YOLO12l) instance segmentation model trained specifically for detecting and segmenting people with high precision.
 
20
 
21
  ## Model Description
22
 
23
- This model is a fine-tuned YOLO12-seg model optimized exclusively for person segmentation. It uses the large (L) scale configuration of YOLO12, featuring 28.76M parameters and 510 layers with a depth and width of 1.0.
 
 
24
 
25
  ### Key Features
26
 
27
- - **Single-Class Focus**: Specialized in detecting only people
28
- - **Detailed Segmentation**: Provides pixel-perfect segmentation masks
29
- - **High Throughput**: Optimized for processing hundreds of images per minute
30
- - **Quality-Optimized**: Trained specifically for accurate boundary delineation
31
- - **GPU-Optimized**: The Large (L) model is designed for GPU deployment, not edge devices or mobile phones
32
 
33
  ## Training
34
 
35
  The model was trained on a filtered version of the COCO dataset containing only images with people:
36
 
37
- - **Training Images**: 64,114 images containing people
38
- - **Validation Images**: 2,693 images containing people
39
- - **Training Details**:
40
- - Initially trained for 100 epochs
41
- - Input resolution: 640×640
42
- - Class-focused optimization with `single_cls=True` and `classes=0`
43
- - Optimized for segmentation with `overlap_mask=True` and `mask_ratio=4`
 
44
 
45
  ## Performance
46
 
@@ -48,14 +61,17 @@ The model achieves the following metrics on the COCO person validation set:
48
 
49
  | Metric | Value |
50
  | ------------------- | ----- |
51
- | Box mAP50-95 (COCO) | 0.628 |
52
- | Box mAP50 (COCO) | 0.840 |
53
- | Mask mAP50-95 | 0.524 |
54
- | Mask mAP50 | 0.821 |
55
- | Box Precision | 0.835 |
56
- | Box Recall | 0.745 |
57
  | Mask Precision | 0.843 |
58
- | Mask Recall | 0.723 |
 
 
 
59
 
60
  These metrics were computed on the standard COCO `val2017` validation set.
61
 
@@ -76,17 +92,20 @@ These metrics were computed on the standard COCO `val2017` validation set.
76
  <img src="examples/example5.png" alt="Person segmentation example 5" style="max-width:90%;" />
77
  </div>
78
 
79
- The model effectively segments people in various poses, lighting conditions, and contexts, providing accurate masks even with complex backgrounds. As shown in these examples, the segmentation masks (highlighted in color) precisely outline the human subjects, making this model ideal for applications requiring detailed person isolation.
 
 
 
80
 
81
  ## Use Cases
82
 
83
  This model is ideal for applications requiring precise person segmentation:
84
 
85
- - Human-centric image editing
86
- - Background removal focused on people
87
- - Virtual try-on applications
88
- - People counting and crowd analysis
89
- - Smart surveillance systems
90
 
91
  ## Usage
92
 
@@ -96,7 +115,7 @@ The model can be used directly with Ultralytics YOLO:
96
  from ultralytics import YOLO
97
 
98
  # Load the model
99
- model = YOLO('path/to/yolo12l-person-seg.pt')
100
 
101
  # Perform inference
102
  results = model('image.jpg')
@@ -121,7 +140,7 @@ import numpy as np
121
  from ultralytics import YOLO
122
 
123
  # Load the model and image
124
- model = YOLO('path/to/yolo12l-person-seg.pt')
125
  image = cv2.imread('image.jpg')
126
 
127
  # Perform inference
@@ -144,12 +163,14 @@ if result.masks is not None:
144
 
145
  ## Limitations
146
 
147
- - This model is optimized for person segmentation only and won't detect other classes
148
- - Performance may be reduced in extreme lighting conditions
149
- - Occluded persons may have incomplete segmentation masks
150
- - Small or distant people might not be detected as reliably as those in foreground
151
- - **GPU Recommended**: As a Large (L) model, real-time inference performance benefits from a dedicated GPU
152
- - **Edge Device Limitations**: Not optimized for mobile or edge deployment (consider YOLO12n or YOLO12s for those use cases)
 
 
153
 
154
  ## License
155
 
@@ -157,4 +178,6 @@ This model is available under the GNU Affero General Public License v3.0 (AGPL-3
157
 
158
  ### License Note
159
 
160
- This model was trained using the Ultralytics YOLO framework, which is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0). As per the terms of the AGPL-3.0 license, any derivative works (including trained models) must also be distributed under the same license.
 
 
 
1
  ---
2
  license: agpl-3.0
3
  tags:
4
+ - yolo
5
+ - yolo12
6
+ - segmentation
7
+ - object-detection
8
+ - person-detection
9
+ - instance-segmentation
10
+ - pytorch
11
+ - ultralytics
12
+ - computer-vision
13
  datasets:
14
+ - coco
15
  ---
16
 
17
  # YOLO12-seg Person Segmentation Model
18
 
19
+ A YOLO12-large (YOLO12l) instance segmentation model trained specifically for detecting and
20
+ segmenting people with high precision.
21
 
22
  ## Model Description
23
 
24
+ This model is a fine-tuned YOLO12-seg model optimized exclusively for person segmentation. It uses
25
+ the large (L) scale configuration of YOLO12, featuring 28.76M parameters and 510 layers with a depth
26
+ and width of 1.0.
27
 
28
  ### Key Features
29
 
30
+ - **Single-Class Focus**: Specialized in detecting only people
31
+ - **Detailed Segmentation**: Provides pixel-perfect segmentation masks
32
+ - **High Throughput**: Optimized for processing hundreds of images per minute
33
+ - **Quality-Optimized**: Trained specifically for accurate boundary delineation
34
+ - **GPU-Optimized**: The Large (L) model is designed for GPU deployment, not edge devices or
35
+ mobile phones
36
+
37
+ ### Available Models
38
+
39
+ This repository contains two model versions:
40
+
41
+ - `yolo12l-person-seg.pt`: The original model trained for 100 epochs.
42
+ - `yolo12l-person-seg-extended.pt`: The improved model after extended training for 300 epochs
43
+ (recommended).
44
 
45
  ## Training
46
 
47
  The model was trained on a filtered version of the COCO dataset containing only images with people:
48
 
49
+ - **Training Images**: 64,114 images containing people
50
+ - **Validation Images**: 2,693 images containing people
51
+ - **Training Details**:
52
+ - Initially trained for 100 epochs, then extended training continued for a total of 300
53
+ epochs.
54
+ - Input resolution: 640×640
55
+ - Class-focused optimization with `single_cls=True` and `classes=0`
56
+ - Optimized for segmentation with `overlap_mask=True` and `mask_ratio=4`
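
The training settings listed above can be collected into a single argument set. The following is a minimal sketch, not the author's actual training script: the keyword names (`epochs`, `imgsz`, `single_cls`, `classes`, `overlap_mask`, `mask_ratio`) are Ultralytics training arguments taken from the README, while `build_train_args`, `train`, the weights path, and the dataset YAML path are hypothetical names introduced for illustration. How the run was continued from 100 to 300 epochs is not stated in the commit, so the sketch only shows a single run with the final epoch count.

```python
def build_train_args(data_yaml: str, epochs: int = 300) -> dict:
    """Collect the training settings named in the README into one dict.

    `data_yaml` is a hypothetical path to a dataset config describing the
    person-only COCO subset; the README does not give its name.
    """
    return {
        "data": data_yaml,
        "epochs": epochs,        # 100 for the original model, 300 for the extended one
        "imgsz": 640,            # input resolution 640x640
        "single_cls": True,      # treat the dataset as a single class
        "classes": 0,            # keep only class 0 (person in COCO)
        "overlap_mask": True,    # segmentation setting from the README
        "mask_ratio": 4,         # mask downsample ratio from the README
    }


def train(weights: str, data_yaml: str):
    """Start a training run; requires the `ultralytics` package.

    `weights` is a hypothetical starting checkpoint or model config path.
    """
    from ultralytics import YOLO  # imported lazily so the helper above works without it

    model = YOLO(weights)
    return model.train(**build_train_args(data_yaml))
```

Keeping the arguments in a plain dict makes it easy to log or diff the configuration between the 100-epoch and 300-epoch runs before any GPU time is spent.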
57
 
58
  ## Performance
59
 
 
61
 
62
  | Metric | Value |
63
  | ------------------- | ----- |
64
+ | Box mAP50-95 (COCO) | 0.642 |
65
+ | Box mAP50 (COCO) | 0.851 |
66
+ | Mask mAP50-95 | 0.537 |
67
+ | Mask mAP50 | 0.837 |
68
+ | Box Precision | 0.840 |
69
+ | Box Recall | 0.759 |
70
  | Mask Precision | 0.843 |
71
+ | Mask Recall | 0.748 |
72
+
73
+ Note: These metrics reflect the performance of the extended 300-epoch model
74
+ (`yolo12l-person-seg-extended.pt`).
75
 
76
  These metrics were computed on the standard COCO `val2017` validation set.
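
To see what "update stats" changed, the values removed and added by this commit can be compared directly. This is a small sketch using only the numbers in the diff; it assumes the removed values describe the original 100-epoch model, which the README implies but does not state outright. The names `BEFORE`, `AFTER`, and `metric_deltas` are illustrative.

```python
# Values removed by this commit (assumed to describe the 100-epoch model).
BEFORE = {
    "Box mAP50-95": 0.628,
    "Box mAP50": 0.840,
    "Mask mAP50-95": 0.524,
    "Mask mAP50": 0.821,
    "Box Precision": 0.835,
    "Box Recall": 0.745,
    "Mask Precision": 0.843,
    "Mask Recall": 0.723,
}

# Values added by this commit (extended 300-epoch model).
AFTER = {
    "Box mAP50-95": 0.642,
    "Box mAP50": 0.851,
    "Mask mAP50-95": 0.537,
    "Mask mAP50": 0.837,
    "Box Precision": 0.840,
    "Box Recall": 0.759,
    "Mask Precision": 0.843,
    "Mask Recall": 0.748,
}


def metric_deltas(before: dict, after: dict) -> dict:
    """Return after - before for each metric, rounded to three decimals."""
    return {name: round(after[name] - before[name], 3) for name in before}


deltas = metric_deltas(BEFORE, AFTER)
# Print the metrics from largest to smallest change.
for name, delta in sorted(deltas.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<15} {BEFORE[name]:.3f} -> {AFTER[name]:.3f} ({delta:+.3f})")
```

Under that assumption, recall moves more than precision (mask precision is unchanged in the diff), which is worth keeping in mind when choosing between the two checkpoints.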
77
 
 
92
  <img src="examples/example5.png" alt="Person segmentation example 5" style="max-width:90%;" />
93
  </div>
94
 
95
+ The model effectively segments people in various poses, lighting conditions, and contexts, providing
96
+ accurate masks even with complex backgrounds. As shown in these examples, the segmentation masks
97
+ (highlighted in color) precisely outline the human subjects, making this model ideal for
98
+ applications requiring detailed person isolation.
99
 
100
  ## Use Cases
101
 
102
  This model is ideal for applications requiring precise person segmentation:
103
 
104
+ - Human-centric image editing
105
+ - Background removal focused on people
106
+ - Virtual try-on applications
107
+ - People counting and crowd analysis
108
+ - Smart surveillance systems
109
 
110
  ## Usage
111
 
 
115
  from ultralytics import YOLO
116
 
117
  # Load the model
118
+ model = YOLO('path/to/yolo12l-person-seg-extended.pt') # Or yolo12l-person-seg.pt for the original
119
 
120
  # Perform inference
121
  results = model('image.jpg')
 
140
  from ultralytics import YOLO
141
 
142
  # Load the model and image
143
+ model = YOLO('path/to/yolo12l-person-seg-extended.pt') # Or yolo12l-person-seg.pt for the original
144
  image = cv2.imread('image.jpg')
145
 
146
  # Perform inference
 
163
 
164
  ## Limitations
165
 
166
+ - This model is optimized for person segmentation only and won't detect other classes
167
+ - Performance may be reduced in extreme lighting conditions
168
+ - Occluded persons may have incomplete segmentation masks
169
+ - Small or distant people might not be detected as reliably as those in foreground
170
+ - **GPU Recommended**: As a Large (L) model, real-time inference performance benefits from a
171
+ dedicated GPU
172
+ - **Edge Device Limitations**: Not optimized for mobile or edge deployment (consider YOLO12n or
173
+ YOLO12s for those use cases)
174
 
175
  ## License
176
 
 
178
 
179
  ### License Note
180
 
181
+ This model was trained using the Ultralytics YOLO framework, which is licensed under the GNU Affero
182
+ General Public License v3.0 (AGPL-3.0). As per the terms of the AGPL-3.0 license, any derivative
183
+ works (including trained models) must also be distributed under the same license.
yolo12l-person-seg-extended.pt ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:7aeafb431135d431a9754f1c6c96303c41fa6b83c50587c330ec93700c72f9b0
3
+ size 58189122
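
The three added lines are a Git LFS pointer: the spec version, the SHA-256 object ID, and the size in bytes of the real weights file. A downloaded copy of `yolo12l-person-seg-extended.pt` can be checked against them. The sketch below uses only the Python standard library; `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, and the local file path is up to the reader.

```python
import hashlib
from pathlib import Path

# Pointer contents exactly as added by this commit.
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:7aeafb431135d431a9754f1c6c96303c41fa6b83c50587c330ec93700c72f9b0
size 58189122
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer into version, hash algorithm, digest and size."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file at `path` has the pointer's size and digest."""
    path = Path(path)
    if path.stat().st_size != pointer["size"]:
        return False  # cheap size check before hashing the file
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as handle:
        # Hash in chunks so the whole file is never held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]
```

For example, `matches_pointer("yolo12l-person-seg-extended.pt", parse_lfs_pointer(POINTER_TEXT))` should return `True` for an intact download. A `False` result with a file of only a few hundred bytes usually means the pointer file itself was fetched instead of the LFS object.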