tayden committed · Commit 1dee41f · verified · 1 Parent(s): 218e12b

Update README.md

Files changed (1)
  1. README.md +271 -1
README.md CHANGED
@@ -7,4 +7,274 @@ language:
  pipeline_tag: image-segmentation
  tags:
  - biology
- ---
+ ---
+
+ # Kelp-RGB: Kelp Segmentation Model for RGB Drone Imagery
+
+ **Model Type:** ONNX Semantic Segmentation
+ **Application:** Kelp forest detection in high-resolution RGB aerial imagery
+ **Input:** 3-band RGB imagery (Red, Green, Blue)
+ **Output:** Multi-class segmentation mask (background, giant kelp, bull kelp)
+
+ ## Model Description
+
+ Kelp-RGB is a deep learning semantic segmentation model trained to detect kelp forests in RGB drone imagery. Because it needs only standard three-band RGB input, it can be used with imagery from consumer drones and cameras for marine habitat monitoring and research.
+
+ **Key Features:**
+ - Optimized for standard RGB imagery from drones
+ - Inputs normalized with ImageNet statistics
+ - Efficient ONNX format for cross-platform deployment
+ - Designed for high-resolution aerial photography (~3-7 cm resolution)
+
+ ## Model Details
+
+ - **Version:** 20250728
+ - **Input Channels:** 3 (RGB)
+ - **Input Size:** Dynamic tiling (recommended: 2048x2048 tiles)
+ - **Normalization:** Standard (ImageNet statistics)
+ - **Output:** Multi-class segmentation (0: background, 1: giant kelp, 2: bull kelp)
+ - **Format:** ONNX
+
+ ### Normalization Parameters
+
+ The model expects input images to be normalized using ImageNet statistics:
+
+ ```json
+ {
+   "mean": [0.485, 0.456, 0.406],
+   "std": [0.229, 0.224, 0.225],
+   "max_pixel_value": 255.0
+ }
+ ```
+
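As a minimal sketch of what these parameters mean in practice, the snippet below assumes the common convention of scaling pixels by `max_pixel_value` first and then subtracting the mean and dividing by the standard deviation per channel. The helper name `normalize` is illustrative only and is not part of kelp-o-matic.

```python
import numpy as np

# Values copied from the normalization parameters above.
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)
MAX_PIXEL_VALUE = 255.0


def normalize(image: np.ndarray) -> np.ndarray:
    """Normalize an HWC RGB image with 0-255 values; returns float32 HWC.

    Hypothetical helper: assumes scale-then-standardize ordering.
    """
    scaled = image.astype(np.float32) / MAX_PIXEL_VALUE
    return (scaled - MEAN) / STD


# A pixel whose scaled value is close to the channel means
# should map to values close to 0 in every channel.
pixel = np.round(MEAN * MAX_PIXEL_VALUE).astype(np.uint8).reshape(1, 1, 3)
normalized = normalize(pixel)
```

Keeping `MEAN` and `STD` as `float32` keeps the result in `float32`, which is the dtype ONNX models of this kind typically expect.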
+ ## Usage
+
+ ### 1. Using kelp-o-matic CLI (recommended)
+
+ For command-line usage:
+
+ ```bash
+ # Install kelp-o-matic
+ pip install kelp-o-matic
+ # or
+ conda install -c conda-forge kelp-o-matic
+
+ # List available models
+ kom list-models
+
+ # Run kelp species segmentation on RGB drone imagery
+ kom segment \
+   --model kelp-rgb \
+   --input /path/to/rgb_drone_image.tif \
+   --output /path/to/kelp_species_segmentation.tif \
+   --batch-size 8 \
+   --crop-size 2048 \
+   --blur-kernel 5 \
+   --morph-kernel 3
+
+ # Use specific model version
+ kom segment \
+   --model kelp-rgb \
+   --version 20250728 \
+   --input image.tif \
+   --output result.tif
+
+ # If memory is limited, use a smaller batch size and smaller tiles
+ kom segment \
+   --model kelp-rgb \
+   --input high_res_drone_image.tif \
+   --output result.tif \
+   --batch-size 4 \
+   --crop-size 1024
+ ```
+
+ ### 2. Using kelp-o-matic Python API
+
+ To use the model from Python, load it through the kelp-o-matic package:
+
+ ```python
+ import numpy as np
+ import rasterio
+
+ from kelp_o_matic import model_registry
+
+ # Load the model (automatically downloads if needed)
+ model = model_registry["kelp-rgb"]
+
+ # Process a large aerial image with automatic tiling
+ model.process(
+     input_path="path/to/your/rgb_drone_image.tif",
+     output_path="path/to/output/kelp_species_segmentation.tif",
+     batch_size=8,  # Higher batch size for RGB
+     crop_size=2048,
+     blur_kernel_size=5,  # Post-processing median blur
+     morph_kernel_size=3,  # Morphological operations
+ )
+
+ # For more control, use the predict method directly
+ with rasterio.open("drone_image.tif") as src:
+     # Read a 2048x2048 tile (3 bands: RGB); rasterio returns channels-first data
+     tile = src.read(window=((0, 2048), (0, 2048)))  # Shape: (3, 2048, 2048)
+
+ # Add batch dimension to get BCHW
+ batch = np.expand_dims(tile, axis=0)  # Shape: (1, 3, 2048, 2048)
+
+ # Run inference (preprocessing handled automatically)
+ predictions = model.predict(batch)
+
+ # Post-process to get final segmentation
+ segmentation = model.postprocess(predictions)
+ # Result: 0=background, 1=giant kelp, 2=bull kelp
+ ```
+
+ ### 3. Direct ONNX Runtime Usage
+
+ ```python
+ import numpy as np
+ import onnxruntime as ort
+ from huggingface_hub import hf_hub_download
+ from PIL import Image
+
+ # Download the model
+ model_path = hf_hub_download(repo_id="HakaiInstitute/kelp-rgb", filename="model.onnx")
+
+ # Load the model
+ session = ort.InferenceSession(model_path)
+
+ # ImageNet normalization parameters (float32 so the input tensor stays float32)
+ mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
+ std = np.array([0.229, 0.224, 0.225], dtype=np.float32)
+
+ # Preprocess your RGB image
+ def preprocess(image):
+     """
+     Preprocess RGB image for model input
+     image: numpy array of shape [height, width, 3] with pixel values 0-255
+     """
+     # Normalize to 0-1
+     image = image.astype(np.float32) / 255.0
+
+     # Apply ImageNet normalization
+     image = (image - mean) / std
+
+     # Reshape to model input format [batch, channels, height, width]
+     image = np.transpose(image, (2, 0, 1))  # HWC to CHW
+     image = np.expand_dims(image, axis=0)  # Add batch dimension
+
+     return image
+
+ # Load and preprocess image (convert to RGB to drop any alpha channel)
+ image = np.array(Image.open("drone_image.jpg").convert("RGB"))
+ preprocessed = preprocess(image)
+
+ # Run inference
+ input_name = session.get_inputs()[0].name
+ output = session.run(None, {input_name: preprocessed})
+
+ # Postprocess to get class predictions
+ logits = output[0]  # Per-class scores, shape [batch, classes, height, width]
+ prediction = np.argmax(logits, axis=1).squeeze(0).astype(np.uint8)
+ # Result: 0=background, 1=giant kelp, 2=bull kelp
+ ```
+
+ ### 4. Using HuggingFace Hub Integration
+
+ ```python
+ from huggingface_hub import hf_hub_download
+ import onnxruntime as ort
+
+ # Download and load model
+ model_path = hf_hub_download(
+     repo_id="HakaiInstitute/kelp-rgb",
+     filename="model.onnx",
+     cache_dir="./models"
+ )
+
+ session = ort.InferenceSession(model_path)
+ # ... continue with preprocessing and inference as above
+ ```
+
+ ## Installation
+
+ ### For kelp-o-matic usage:
+
+ ```bash
+ # Via pip
+ pip install kelp-o-matic
+
+ # Via conda
+ conda install -c conda-forge kelp-o-matic
+ ```
+
+ ### For direct ONNX usage:
+
+ ```bash
+ pip install onnxruntime huggingface-hub numpy pillow
+ # For GPU support:
+ pip install onnxruntime-gpu
+ ```
+
+ ## Input Requirements
+
+ - **Image Format:** 3-band RGB raster (JPEG, PNG, GeoTIFF)
+ - **Band Order:** Red, Green, Blue
+ - **Pixel Values:** Standard 8-bit (0-255 range) or 16-bit imagery
+ - **Spatial Resolution:** Optimized for high-resolution drone imagery (cm-level)
+
+ ## Output Format
+
+ - **Type:** Single-band raster with class labels
+ - **Values:**
+   - 0: Background (water, other features)
+   - 1: *Macrocystis pyrifera* (Giant kelp)
+   - 2: *Nereocystis luetkeana* (Bull kelp)
+ - **Format:** Matches input raster format and projection
+ - **Spatial Resolution:** Same as input
+
+ **Note:** The model outputs per-class scores; kelp-o-matic automatically applies argmax to convert these to discrete class labels.
+
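As an illustration of how the class labels above might be summarized, the following sketch counts pixels per class and converts the counts to area. The function name `class_areas_m2`, the tiny example mask, and the 5 cm pixel size are assumptions made for this example; in practice the pixel size would come from the raster's geotransform.

```python
import numpy as np

# Class labels as listed in the output format above.
CLASS_NAMES = {0: "background", 1: "giant kelp", 2: "bull kelp"}


def class_areas_m2(mask: np.ndarray, pixel_size_m: float) -> dict:
    """Return the area in square metres covered by each class label.

    Hypothetical helper: assumes square pixels of side `pixel_size_m`.
    """
    counts = np.bincount(mask.ravel(), minlength=len(CLASS_NAMES))
    pixel_area = pixel_size_m ** 2
    return {name: float(counts[label] * pixel_area)
            for label, name in CLASS_NAMES.items()}


# Tiny made-up label mask: 6 background, 3 giant kelp, 3 bull kelp pixels.
mask = np.array(
    [[0, 0, 1, 1],
     [0, 2, 2, 2],
     [0, 0, 0, 1]],
    dtype=np.uint8,
)
areas = class_areas_m2(mask, pixel_size_m=0.05)
```
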
+ ## Performance Notes
+
+ - **Dynamic Tile Size:** Supports flexible tile sizes (recommended: 2048x2048 or 1024x1024)
+ - **Batch Size:** Start with 4, adjust based on available GPU memory
+
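To get a rough feel for how batch size and tile size interact, the sketch below estimates the size of the input tensor alone, assuming 3-channel `float32` input. It is only a lower bound: the model's intermediate activations, which usually dominate memory use, are not counted. The function name `input_batch_bytes` is illustrative.

```python
def input_batch_bytes(batch_size: int, tile_size: int,
                      channels: int = 3, bytes_per_value: int = 4) -> int:
    """Bytes needed for one float32 input batch of square tiles (BCHW)."""
    return batch_size * channels * tile_size * tile_size * bytes_per_value


# Compare a few batch/tile combinations mentioned in this README.
for batch_size, tile_size in [(4, 1024), (4, 2048), (8, 2048)]:
    mib = input_batch_bytes(batch_size, tile_size) / 2**20
    print(f"batch={batch_size}, tile={tile_size}: {mib:.0f} MiB input tensor")
```

Doubling the tile side quadruples the input size, so reducing the tile size usually frees more memory than halving the batch size.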
+ ## Large Image Processing
+
+ For processing large geospatial images, the kelp-o-matic package handles:
+
+ - **Automatic Tiling:** Splits large images into manageable tiles
+ - **Overlap Handling:** Uses overlapping tiles to avoid edge artifacts
+ - **Memory Management:** Processes tiles in batches to manage memory usage
+ - **Geospatial Metadata:** Preserves coordinate reference system and geotransforms
+ - **Post-processing:** Optional median filtering and morphological operations
+
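The general idea of overlapping tiling can be sketched as follows. This is not kelp-o-matic's implementation; the function name `tile_windows` and the 256-pixel overlap are assumptions chosen for illustration. The sketch only generates window coordinates, placing the last tile in each direction flush with the image edge so that every pixel is covered.

```python
def tile_windows(height, width, tile=2048, overlap=256):
    """Yield (row_start, row_stop, col_start, col_stop) windows covering an image.

    Hypothetical helper: neighbouring windows share at least `overlap` pixels.
    """
    if not 0 <= overlap < tile:
        raise ValueError("overlap must be non-negative and smaller than tile")
    stride = tile - overlap

    def starts(size):
        # A single window suffices when the image fits inside one tile.
        if size <= tile:
            return [0]
        positions = list(range(0, size - tile, stride))
        positions.append(size - tile)  # final window ends exactly at the edge
        return positions

    for row in starts(height):
        for col in starts(width):
            yield row, min(row + tile, height), col, min(col + tile, width)


windows = list(tile_windows(5000, 3000))
```

For a 5000 x 3000 pixel image with these defaults, the sketch yields six windows. Predictions in the overlapping regions would still need to be merged, for example by cropping each tile's border before writing it out.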
+ ## Citation
+
+ If you use this model in your research, please cite:
+
+ ```bibtex
+ @software{Denouden_Kelp-O-Matic,
+   author = {Denouden, Taylor and Reshitnyk, Luba},
+   doi = {10.5281/zenodo.7672166},
+   title = {{Kelp-O-Matic}},
+   url = {https://github.com/HakaiInstitute/kelp-o-matic}
+ }
+ ```
+
+ ## License
+
+ MIT License - see the [kelp-o-matic repository](https://github.com/HakaiInstitute/kelp-o-matic/blob/main/LICENSE) for details.
+
+ ## Related Resources
+
+ - **Documentation:** [kelp-o-matic.readthedocs.io](https://kelp-o-matic.readthedocs.io)
+ - **Source Code:** [github.com/HakaiInstitute/kelp-o-matic](https://github.com/HakaiInstitute/kelp-o-matic)
+ - **Other Models:** Check the [Hakai Institute HuggingFace organization](https://huggingface.co/HakaiInstitute) for additional kelp segmentation models
+
+ ## Contact
+
+ For questions or issues:
+ - Open an issue on the [GitHub repository](https://github.com/HakaiInstitute/kelp-o-matic/issues)
+ - Contact: [Hakai Institute](https://www.hakai.org)