---
tags:
- image-classification
- birder
- pytorch
library_name: birder
license: apache-2.0
---
# Model Card for mobilenet_v4_l_eu-common
A MobileNet v4 image classification model. This model was trained on the `eu-common` dataset containing common European bird species.
The species list is derived from the Collins bird guide [^1].
[^1]: Svensson, L., Mullarney, K., & Zetterström, D. (2022). Collins bird guide (3rd ed.). London, England: William Collins.
## Model Details
- **Model Type:** Image classification and detection backbone
- **Model Stats:**
- Params (M): 32.2
- Input image size: 384 x 384
- **Dataset:** eu-common (707 classes)
- **Papers:**
- MobileNetV4 -- Universal Models for the Mobile Ecosystem: <https://arxiv.org/abs/2404.10518>
## Model Usage
### Image Classification
```python
import birder
from birder.inference.classification import infer_image
(net, model_info) = birder.load_pretrained_model("mobilenet_v4_l_eu-common", inference=True)
# Get the image size the model was trained on
size = birder.get_size_from_signature(model_info.signature)
# Create an inference transform
transform = birder.classification_transform(size, model_info.rgb_stats)
image = "path/to/image.jpeg" # or a PIL image, must be loaded in RGB format
(out, _) = infer_image(net, image, transform)
# out is a NumPy array with shape (1, 707), representing class probabilities
```
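The probabilities can be post-processed with plain NumPy. The snippet below is a minimal sketch that continues from the example above and picks the top-5 predictions by index; mapping indices back to species names depends on the class list shipped with the model and is not shown here.

```python
import numpy as np

# `out` has shape (1, 707); take the single row of probabilities
probs = out[0]

# Indices of the five highest-probability classes, best first
top5 = np.argsort(probs)[::-1][:5]
for idx in top5:
    print(f"class index {idx}: probability {probs[idx]:.4f}")
```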
### Image Embeddings
```python
import birder
from birder.inference.classification import infer_image
(net, model_info) = birder.load_pretrained_model("mobilenet_v4_l_eu-common", inference=True)
# Get the image size the model was trained on
size = birder.get_size_from_signature(model_info.signature)
# Create an inference transform
transform = birder.classification_transform(size, model_info.rgb_stats)
image = "path/to/image.jpeg" # or a PIL image
(out, embedding) = infer_image(net, image, transform, return_embedding=True)
# embedding is a NumPy array with shape (1, 1280)
```
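Embeddings are typically used for similarity search or clustering. As a minimal sketch (assuming two embeddings obtained as above), cosine similarity can be computed with NumPy:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Flatten the (1, 1280) arrays and compute the normalized dot product
    a = a.ravel()
    b = b.ravel()
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# `embedding_a` and `embedding_b` would come from two calls to
# infer_image(..., return_embedding=True); values near 1.0 indicate similar images
# score = cosine_similarity(embedding_a, embedding_b)
```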
### Detection Feature Map
```python
from PIL import Image
import birder
(net, model_info) = birder.load_pretrained_model("mobilenet_v4_l_eu-common", inference=True)
# Get the image size the model was trained on
size = birder.get_size_from_signature(model_info.signature)
# Create an inference transform
transform = birder.classification_transform(size, model_info.rgb_stats)
image = Image.open("path/to/image.jpeg")
features = net.detection_features(transform(image).unsqueeze(0))
# features is a dict (stage name -> torch.Tensor)
print([(k, v.size()) for k, v in features.items()])
# Output example:
# [('stage1', torch.Size([1, 48, 96, 96])),
# ('stage2', torch.Size([1, 96, 48, 48])),
# ('stage3', torch.Size([1, 192, 24, 24])),
# ('stage4', torch.Size([1, 512, 12, 12]))]
```
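The stage outputs follow the usual backbone convention of progressively halved spatial resolution. As a small illustration continuing from the snippet above, the effective stride of each stage relative to the 384 x 384 input can be derived from the feature map sizes, which is useful when wiring the backbone into a detection head:

```python
input_size = 384  # the resolution this model was trained at

for name, feat in features.items():
    _, channels, height, _ = feat.size()
    stride = input_size // height
    print(f"{name}: {channels} channels, stride {stride}")
# Expected from the shapes above: stage1 stride 4, stage2 stride 8,
# stage3 stride 16, stage4 stride 32
```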
## Citation
```bibtex
@misc{qin2024mobilenetv4universalmodels,
title={MobileNetV4 -- Universal Models for the Mobile Ecosystem},
author={Danfeng Qin and Chas Leichner and Manolis Delakis and Marco Fornoni and Shixin Luo and Fan Yang and Weijun Wang and Colby Banbury and Chengxi Ye and Berkin Akin and Vaibhav Aggarwal and Tenghui Zhu and Daniele Moro and Andrew Howard},
year={2024},
eprint={2404.10518},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2404.10518},
}
```