---
library_name: keras-hub
---
### Model Overview
A Keras model implementing the RetinaNet meta-architecture for object detection. The constructor requires `num_classes`, `bounding_box_format`, and a backbone. Optionally, a custom label encoder and prediction decoder may be provided.

## Links

* [RetinaNet Quickstart Notebook](https://www.kaggle.com/code/sineeli/retinanet-training-guide)
* RetinaNet API Documentation (coming soon)
* [RetinaNet Model Card](https://huggingface.co/keras-io/Object-Detection-RetinaNet)
* [KerasHub Beginner Guide](https://keras.io/guides/keras_hub/getting_started/)
* [KerasHub Model Publishing Guide](https://keras.io/guides/keras_hub/upload/)

## Installation

Keras and KerasHub can be installed with:

```
pip install -U -q keras-hub
pip install -U -q keras
```

JAX, TensorFlow, and Torch come preinstalled in Kaggle Notebooks. For instructions on installing them in another environment, see the [Keras Getting Started](https://keras.io/getting_started/) page.

## Presets

The following model checkpoints are provided by the Keras team. Full code examples for each are available below.

| Preset name | Parameters | Description |
|-------------|------------|-------------|
| retinanet_resnet50_fpn_coco | 34.12M | RetinaNet model with ResNet50 backbone fine-tuned on COCO in 800x800 resolution. |

__Arguments__

- __num_classes__: The number of classes in your dataset, excluding the background class. Classes should be represented by integers in the range `[0, num_classes)`.
- __bounding_box_format__: The format of the bounding boxes in the input dataset. Refer to the [keras.io docs](https://keras.io/api/keras_cv/bounding_box/formats/) for more details on supported bounding box formats.
- __backbone__: `keras.Model`. If the default `feature_pyramid` is used, the backbone must implement the `pyramid_level_inputs` property with keys "P3", "P4", and "P5" and layer names as values. A sensible default in many cases is `keras_cv.models.ResNetBackbone.from_preset("resnet50_imagenet")`.
- __anchor_generator__: (Optional) A `keras_cv.layers.AnchorGenerator`. If provided, the anchor generator is passed to both the `label_encoder` and the `prediction_decoder`. Use it only when `label_encoder` and `prediction_decoder` are both `None`. Defaults to an anchor generator with the parameterization `strides=[2**i for i in range(3, 8)]`, `scales=[2**x for x in [0, 1 / 3, 2 / 3]]`, `sizes=[32.0, 64.0, 128.0, 256.0, 512.0]`, and `aspect_ratios=[0.5, 1.0, 2.0]`.
- __label_encoder__: (Optional) A `keras.Layer` whose `call()` method accepts an image Tensor, a bounding box Tensor, and a bounding box class Tensor, and returns RetinaNet training targets. By default, a KerasCV standard `RetinaNetLabelEncoder` is created and used. The results of this object's `call()` method are passed to the `loss` object as the `y_true` argument for `box_loss` and `classification_loss`.
- __prediction_decoder__: (Optional) A `keras.layers.Layer` that is responsible for transforming RetinaNet predictions into usable bounding box Tensors. If not provided, a default `keras_cv.layers.MultiClassNonMaxSuppression` layer is used, which prunes boxes with Non-Max Suppression.
- __feature_pyramid__: (Optional) A `keras.layers.Layer` that produces a list of 4D feature maps (batch dimension included) when called on the pyramid-level outputs of the `backbone`. If not provided, the reference implementation from the paper is used.
- __classification_head__: (Optional) A `keras.Layer` that performs classification of the bounding boxes. If not provided, a simple ConvNet with 3 layers is used.
- __box_head__: (Optional) A `keras.Layer` that performs regression of the bounding boxes. If not provided, a simple ConvNet with 3 layers is used.
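
To make the default anchor parameterization listed under `anchor_generator` concrete, here is a minimal sketch in plain Python that enumerates the anchor shapes for each pyramid level and counts the anchors for a given image size. It does not call `keras_cv.layers.AnchorGenerator`. The helper names `anchor_shapes` and `total_anchors` are hypothetical, and two conventions are assumptions rather than documented behaviour: the aspect ratio is taken as width / height with the anchor area held at `(size * scale) ** 2`, and each feature map is taken to have `ceil(image_dim / stride)` cells per side.

```python
import math

# Default parameterization quoted in the argument list.
strides = [2**i for i in range(3, 8)]
scales = [2**x for x in [0, 1 / 3, 2 / 3]]
sizes = [32.0, 64.0, 128.0, 256.0, 512.0]
aspect_ratios = [0.5, 1.0, 2.0]


def anchor_shapes(size, scales, aspect_ratios):
    """Return (width, height) pairs for one pyramid level.

    Assumption: ratio = width / height, area = (size * scale) ** 2.
    """
    shapes = []
    for scale in scales:
        for ratio in aspect_ratios:
            width = size * scale * math.sqrt(ratio)
            height = size * scale / math.sqrt(ratio)
            shapes.append((width, height))
    return shapes


def total_anchors(image_height, image_width):
    """Count anchors over all levels.

    Assumption: a level with stride s has ceil(dim / s) cells per side.
    """
    per_location = len(scales) * len(aspect_ratios)
    total = 0
    for stride in strides:
        rows = math.ceil(image_height / stride)
        cols = math.ceil(image_width / stride)
        total += rows * cols * per_location
    return total


if __name__ == "__main__":
    for size in sizes:
        print(size, [(round(w, 1), round(h, 1)) for w, h in anchor_shapes(size, scales, aspect_ratios)])
    print("anchors for an 800x800 image:", total_anchors(800, 800))
```

Under these assumptions every feature-map location carries 3 scales x 3 aspect ratios = 9 anchors, so the total grows quickly with image resolution; this is why the prediction decoder prunes boxes before returning them.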
## Example Usage
## Pretrained RetinaNet model
```
import keras_hub
import numpy as np

object_detector = keras_hub.models.ImageObjectDetector.from_preset(
    "retinanet_resnet50_fpn_coco"
)

# Run the detector on a random batch of two 224x224 RGB images.
input_data = np.random.uniform(0, 1, size=(2, 224, 224, 3))
object_detector(input_data)
```

## Fine-tune the pre-trained model
```python3
import keras_hub

# CLASSES is assumed to be your own list of class names; it is not defined here.
backbone = keras_hub.models.Backbone.from_preset(
    "retinanet_resnet50_fpn_coco"
)
preprocessor = keras_hub.models.RetinaNetObjectDetectorPreprocessor.from_preset(
    "retinanet_resnet50_fpn_coco"
)
model = keras_hub.models.RetinaNetObjectDetector(
    backbone=backbone,
    num_classes=len(CLASSES),
    preprocessor=preprocessor
)
```

## Custom training the model
```python3
import keras_hub

# CLASSES is assumed to be your own list of class names; it is not defined here.
image_converter = keras_hub.layers.RetinaNetImageConverter(
    scale=1/255
)
preprocessor = keras_hub.models.RetinaNetObjectDetectorPreprocessor(
    image_converter=image_converter
)

# Load a pre-trained ResNet50 model.
# This will serve as the base for extracting image features.
image_encoder = keras_hub.models.Backbone.from_preset(
    "resnet_50_imagenet"
)

# Build the RetinaNet Feature Pyramid Network (FPN) on top of the ResNet50
# backbone. The FPN creates multi-scale feature maps for better object detection
# at different sizes.
backbone = keras_hub.models.RetinaNetBackbone(
    image_encoder=image_encoder,
    min_level=3,
    max_level=5,
    use_p5=False
)
model = keras_hub.models.RetinaNetObjectDetector(
    backbone=backbone,
    num_classes=len(CLASSES),
    preprocessor=preprocessor
)
```

## Example Usage with Hugging Face URI
## Pretrained RetinaNet model
```
import keras_hub
import numpy as np

object_detector = keras_hub.models.ImageObjectDetector.from_preset(
    "hf://keras/retinanet_resnet50_fpn_coco"
)

# Run the detector on a random batch of two 224x224 RGB images.
input_data = np.random.uniform(0, 1, size=(2, 224, 224, 3))
object_detector(input_data)
```

## Fine-tune the pre-trained model
```python3
import keras_hub

# CLASSES is assumed to be your own list of class names; it is not defined here.
backbone = keras_hub.models.Backbone.from_preset(
    "hf://keras/retinanet_resnet50_fpn_coco"
)
preprocessor = keras_hub.models.RetinaNetObjectDetectorPreprocessor.from_preset(
    "hf://keras/retinanet_resnet50_fpn_coco"
)
model = keras_hub.models.RetinaNetObjectDetector(
    backbone=backbone,
    num_classes=len(CLASSES),
    preprocessor=preprocessor
)
```

## Custom training the model
```python3
import keras_hub

# CLASSES is assumed to be your own list of class names; it is not defined here.
image_converter = keras_hub.layers.RetinaNetImageConverter(
    scale=1/255
)
preprocessor = keras_hub.models.RetinaNetObjectDetectorPreprocessor(
    image_converter=image_converter
)

# Load a pre-trained ResNet50 model.
# This will serve as the base for extracting image features.
image_encoder = keras_hub.models.Backbone.from_preset(
    "resnet_50_imagenet"
)

# Build the RetinaNet Feature Pyramid Network (FPN) on top of the ResNet50
# backbone. The FPN creates multi-scale feature maps for better object detection
# at different sizes.
backbone = keras_hub.models.RetinaNetBackbone(
    image_encoder=image_encoder,
    min_level=3,
    max_level=5,
    use_p5=False
)
model = keras_hub.models.RetinaNetObjectDetector(
    backbone=backbone,
    num_classes=len(CLASSES),
    preprocessor=preprocessor
)
```