RyanJames/yolo12l-person-seg

@@ -1,46 +1,59 @@
 ---
 license: agpl-3.0
 tags:
-  - yolo
-  - yolo12
-  - segmentation
-  - object-detection
-  - person-detection
-  - instance-segmentation
-  - pytorch
-  - ultralytics
-  - computer-vision
 datasets:
-  - coco
 ---
 # YOLO12-seg Person Segmentation Model
-A YOLO12-large (YOLO12l) instance segmentation model trained specifically for detecting and segmenting people with high precision.
 ## Model Description
-This model is a fine-tuned YOLO12-seg model optimized exclusively for person segmentation. It uses the large (L) scale configuration of YOLO12, featuring 28.76M parameters and 510 layers with a depth and width of 1.0.
 ### Key Features
-- **Single-Class Focus**: Specialized in detecting only people
-- **Detailed Segmentation**: Provides pixel-level segmentation masks
-- **High Throughput**: Optimized for processing hundreds of images per minute
-- **Quality-Optimized**: Trained specifically for accurate boundary delineation
-- **GPU-Optimized**: The Large (L) model is designed for GPU deployment, not edge devices or mobile phones
 ## Training
 The model was trained on a filtered version of the COCO dataset containing only images with people:
-- **Training Images**: 64,114 images containing people
-- **Validation Images**: 2,693 images containing people
-- **Training Details**:
-  - Initially trained for 100 epochs
-  - Input resolution: 640×640
-  - Class-focused optimization with `single_cls=True` and `classes=0`
-  - Optimized for segmentation with `overlap_mask=True` and `mask_ratio=4`
 ## Performance
@@ -48,14 +61,17 @@ The model achieves the following metrics on the COCO person validation set:
 | Metric              | Value |
 | ------------------- | ----- |
-| Box mAP50-95 (COCO) | 0.628 |
-| Box mAP50 (COCO)    | 0.840 |
-| Mask mAP50-95       | 0.524 |
-| Mask mAP50          | 0.821 |
-| Box Precision       | 0.835 |
-| Box Recall          | 0.745 |
 | Mask Precision      | 0.843 |
-| Mask Recall         | 0.723 |
 These metrics were computed on the standard COCO `val2017` validation set.
@@ -76,17 +92,20 @@ These metrics were computed on the standard COCO `val2017` validation set.
   <img src="examples/example5.png" alt="Person segmentation example 5" style="max-width:90%;" />
 </div>
-The model effectively segments people in various poses, lighting conditions, and contexts, providing accurate masks even with complex backgrounds. As shown in these examples, the segmentation masks (highlighted in color) precisely outline the human subjects, making this model ideal for applications requiring detailed person isolation.
 ## Use Cases
 This model is ideal for applications requiring precise person segmentation:
-- Human-centric image editing
-- Background removal focused on people
-- Virtual try-on applications
-- People counting and crowd analysis
-- Smart surveillance systems
 ## Usage
@@ -96,7 +115,7 @@ The model can be used directly with Ultralytics YOLO:
 from ultralytics import YOLO
 # Load the model
-model = YOLO('path/to/yolo12l-person-seg.pt')
 # Perform inference
 results = model('image.jpg')
@@ -121,7 +140,7 @@ import numpy as np
 from ultralytics import YOLO
 # Load the model and image
-model = YOLO('path/to/yolo12l-person-seg.pt')
 image = cv2.imread('image.jpg')
 # Perform inference
@@ -144,12 +163,14 @@ if result.masks is not None:
 ## Limitations
-- This model is optimized for person segmentation only and won't detect other classes
-- Performance may be reduced in extreme lighting conditions
-- Occluded persons may have incomplete segmentation masks
-- Small or distant people might not be detected as reliably as those in the foreground
-- **GPU Recommended**: As a Large (L) model, real-time inference performance benefits from a dedicated GPU
-- **Edge Device Limitations**: Not optimized for mobile or edge deployment (consider YOLO12n or YOLO12s for those use cases)
 ## License
@@ -157,4 +178,6 @@ This model is available under the GNU Affero General Public License v3.0 (AGPL-3
 ### License Note
-This model was trained using the Ultralytics YOLO framework, which is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0). As per the terms of the AGPL-3.0 license, any derivative works (including trained models) must also be distributed under the same license.

 ---
 license: agpl-3.0
 tags:
+    - yolo
+    - yolo12
+    - segmentation
+    - object-detection
+    - person-detection
+    - instance-segmentation
+    - pytorch
+    - ultralytics
+    - computer-vision
 datasets:
+    - coco
 ---
 # YOLO12-seg Person Segmentation Model
+A YOLO12-large (YOLO12l) instance segmentation model trained specifically for detecting and
+segmenting people with high precision.
 ## Model Description
+This model is a fine-tuned YOLO12-seg model optimized exclusively for person segmentation. It uses
+the large (L) scale configuration of YOLO12, featuring 28.76M parameters and 510 layers with a depth
+and width of 1.0.
 ### Key Features
+-   **Single-Class Focus**: Specialized in detecting only people
+-   **Detailed Segmentation**: Provides pixel-level segmentation masks
+-   **High Throughput**: Optimized for processing hundreds of images per minute
+-   **Quality-Optimized**: Trained specifically for accurate boundary delineation
+-   **GPU-Optimized**: The Large (L) model is designed for GPU deployment, not edge devices or
+    mobile phones
+### Available Models
+This repository contains two model versions:
+-   `yolo12l-person-seg.pt`: The original model trained for 100 epochs.
+-   `yolo12l-person-seg-extended.pt`: The improved model after extended training for 300 epochs
+    (recommended).
 ## Training
 The model was trained on a filtered version of the COCO dataset containing only images with people:
+-   **Training Images**: 64,114 images containing people
+-   **Validation Images**: 2,693 images containing people
+-   **Training Details**:
+    -   Initially trained for 100 epochs; training was then extended to a total of 300 epochs.
+    -   Input resolution: 640×640
+    -   Class-focused optimization with `single_cls=True` and `classes=0`
+    -   Optimized for segmentation with `overlap_mask=True` and `mask_ratio=4`
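The settings listed above correspond to keyword arguments of the Ultralytics `train` call. The following is a minimal sketch, not the command actually used for this model: the dataset config name `coco-person.yaml` and the `weights` path are hypothetical placeholders, and how the run was extended from 100 to 300 epochs is not documented here.

```python
def build_train_args(data_yaml: str = "coco-person.yaml", epochs: int = 300) -> dict:
    """Collect the training settings listed in the model card.

    `data_yaml` is a hypothetical dataset config name, not a file from this repo.
    """
    return {
        "data": data_yaml,
        "epochs": epochs,      # 100 for the original run, 300 in total for the extended one
        "imgsz": 640,          # input resolution 640x640
        "single_cls": True,    # treat all labels as a single class
        "classes": [0],        # COCO class 0 (person); the card writes this as `classes=0`
        "overlap_mask": True,  # merge instance masks into one mask per image during training
        "mask_ratio": 4,       # downsample ratio applied to training masks
    }


def train(weights: str, **overrides):
    """Start training; `weights` is a hypothetical path to a YOLO12-seg checkpoint or YAML."""
    # Imported lazily so the settings can be inspected without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights)
    return model.train(**{**build_train_args(), **overrides})
```

Keeping the settings in a plain dictionary makes it easy to log or compare them between the 100-epoch and 300-epoch runs before any training is started.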
 ## Performance
 | Metric              | Value |
 | ------------------- | ----- |
+| Box mAP50-95 (COCO) | 0.642 |
+| Box mAP50 (COCO)    | 0.851 |
+| Mask mAP50-95       | 0.537 |
+| Mask mAP50          | 0.837 |
+| Box Precision       | 0.840 |
+| Box Recall          | 0.759 |
 | Mask Precision      | 0.843 |
+| Mask Recall         | 0.748 |
+Note: These metrics reflect the performance of the extended 300-epoch model
+(`yolo12l-person-seg-extended.pt`).
 These metrics were computed on the standard COCO `val2017` validation set.
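The removed (`-`) and added (`+`) rows of the metrics table can be compared directly. A small sketch, assuming the removed rows describe the original 100-epoch checkpoint; the dictionary and function names are illustrative only:

```python
# Values copied from the removed (-) and added (+) table rows of this diff.
# Mask Precision is an unchanged context row, so it is identical in both.
BEFORE = {
    "Box mAP50-95": 0.628,
    "Box mAP50": 0.840,
    "Mask mAP50-95": 0.524,
    "Mask mAP50": 0.821,
    "Box Precision": 0.835,
    "Box Recall": 0.745,
    "Mask Precision": 0.843,
    "Mask Recall": 0.723,
}
AFTER = {
    "Box mAP50-95": 0.642,
    "Box mAP50": 0.851,
    "Mask mAP50-95": 0.537,
    "Mask mAP50": 0.837,
    "Box Precision": 0.840,
    "Box Recall": 0.759,
    "Mask Precision": 0.843,
    "Mask Recall": 0.748,
}


def metric_deltas(before: dict, after: dict) -> dict:
    """Return after - before for each metric, rounded to the table's three decimals."""
    return {name: round(after[name] - before[name], 3) for name in before}


for name, delta in metric_deltas(BEFORE, AFTER).items():
    print(f"{name:<15} {delta:+.3f}")
```

Every changed metric moves up by between 0.005 and 0.025, with the largest change in Mask Recall (+0.025).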
   <img src="examples/example5.png" alt="Person segmentation example 5" style="max-width:90%;" />
 </div>
+The model effectively segments people in various poses, lighting conditions, and contexts, providing
+accurate masks even with complex backgrounds. As shown in these examples, the segmentation masks
+(highlighted in color) precisely outline the human subjects, making this model ideal for
+applications requiring detailed person isolation.
 ## Use Cases
 This model is ideal for applications requiring precise person segmentation:
+-   Human-centric image editing
+-   Background removal focused on people
+-   Virtual try-on applications
+-   People counting and crowd analysis
+-   Smart surveillance systems
 ## Usage
 from ultralytics import YOLO
 # Load the model
+model = YOLO('path/to/yolo12l-person-seg-extended.pt')  # Or yolo12l-person-seg.pt for the original
 # Perform inference
 results = model('image.jpg')
 from ultralytics import YOLO
 # Load the model and image
+model = YOLO('path/to/yolo12l-person-seg-extended.pt')  # Or yolo12l-person-seg.pt for the original
 image = cv2.imread('image.jpg')
 # Perform inference
 ## Limitations
+-   This model is optimized for person segmentation only and won't detect other classes
+-   Performance may be reduced in extreme lighting conditions
+-   Occluded persons may have incomplete segmentation masks
+-   Small or distant people might not be detected as reliably as those in the foreground
+-   **GPU Recommended**: As a Large (L) model, real-time inference performance benefits from a
+    dedicated GPU
+-   **Edge Device Limitations**: Not optimized for mobile or edge deployment (consider YOLO12n or
+    YOLO12s for those use cases)
 ## License
 ### License Note
+This model was trained using the Ultralytics YOLO framework, which is licensed under the GNU Affero
+General Public License v3.0 (AGPL-3.0). As per the terms of the AGPL-3.0 license, any derivative
+works (including trained models) must also be distributed under the same license.

yolo12l-person-seg-extended.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:7aeafb431135d431a9754f1c6c96303c41fa6b83c50587c330ec93700c72f9b0
+size 58189122
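
The three added lines form a Git LFS pointer: the spec version, the SHA-256 object ID, and the size in bytes of the actual weights file. A downloaded checkpoint can be checked against these values. A minimal sketch using only the Python standard library; the function names are illustrative assumptions, and the path passed to `matches_pointer` would be wherever the file was saved locally:

```python
import hashlib
import os

# Pointer contents copied from the diff above (without the leading "+").
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:7aeafb431135d431a9754f1c6c96303c41fa6b83c50587c330ec93700c72f9b0
size 58189122
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line of a Git LFS pointer into a dictionary."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: str, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file at `path` has the size and hash recorded in the pointer."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with open(path, "rb") as fh:
        # Read in chunks so a large checkpoint is never loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["algo"], pointer["size"])
```

Checking the size first is a cheap early exit; the hash is only computed when the byte count already matches.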