hassonofer committed on
Commit 2d6f610 · verified · 1 Parent(s): 2a65938

Update README.md

Files changed (1)
  1. README.md +103 -3
README.md CHANGED
@@ -1,3 +1,103 @@
- ---
- license: apache-2.0
- ---
+ ---
+ tags:
+ - image-classification
+ - birder
+ library_name: birder
+ license: apache-2.0
+ ---
+
+ # Model Card for biformer_s_il-all
+
+ A BiFormer image classification model. This model was trained on the `il-all` dataset, which covers all relevant bird species found in Israel, including rarities.
+
+ The species list is derived from data available at <https://www.israbirding.com/checklist/>.
+
+ ## Model Details
+
+ - **Model Type:** Image classification and detection backbone
+ - **Model Stats:**
+     - Params (M): 25.3
+     - Input image size: 384 x 384
+ - **Dataset:** il-all (550 classes)
+
+ - **Papers:**
+     - BiFormer: Vision Transformer with Bi-Level Routing Attention: <https://arxiv.org/abs/2303.08810>
+
+ ## Model Usage
+
+ ### Image Classification
+
+ ```python
+ import birder
+ from birder.inference.classification import infer_image
+
+ (net, class_to_idx, signature, rgb_stats) = birder.load_pretrained_model("biformer_s_il-all", inference=True)
+
+ # Get the image size the model was trained on
+ size = birder.get_size_from_signature(signature)
+
+ # Create an inference transform
+ transform = birder.classification_transform(size, rgb_stats)
+
+ image = "path/to/image.jpeg"  # or a PIL image
+ (out, _) = infer_image(net, image, transform)
+ # out is a NumPy array with shape of (1, num_classes)
+ ```
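Editorial sketch (not part of the commit): the classification output above has one score per class, so a natural next step is to map the highest scores back to class names. The snippet below is a minimal, library-free illustration under stated assumptions: the class names and scores are hypothetical stand-ins, and in real use `scores` would be `out[0].tolist()` and `class_to_idx` would be the mapping returned by `birder.load_pretrained_model`. The ranking is the same whether the scores are logits or probabilities.

```python
# Hypothetical stand-ins for the real model outputs (assumptions, not real data):
# in practice, scores = out[0].tolist() and class_to_idx comes from the loader.
class_to_idx = {"Barn owl": 0, "Common kestrel": 1, "Hooded crow": 2, "Little egret": 3}
scores = [0.05, 0.70, 0.15, 0.10]


def top_k(scores, class_to_idx, k=3):
    """Return the k highest-scoring (class name, score) pairs, best first."""
    # Invert the name -> index mapping so indices can be turned back into names
    idx_to_class = {idx: name for (name, idx) in class_to_idx.items()}
    # Sort class indices by descending score
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(idx_to_class[i], scores[i]) for i in ranked[:k]]


predictions = top_k(scores, class_to_idx, k=2)
print(predictions)
```

The helper name `top_k` is illustrative only; it is not a birder function.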
+
+ ### Image Embeddings
+
+ ```python
+ import birder
+ from birder.inference.classification import infer_image
+
+ (net, class_to_idx, signature, rgb_stats) = birder.load_pretrained_model("biformer_s_il-all", inference=True)
+
+ # Get the image size the model was trained on
+ size = birder.get_size_from_signature(signature)
+
+ # Create an inference transform
+ transform = birder.classification_transform(size, rgb_stats)
+
+ image = "path/to/image.jpeg"  # or a PIL image
+ (out, embedding) = infer_image(net, image, transform, return_embedding=True)
+ # embedding is a NumPy array with shape of (1, embedding_size)
+ ```
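Editorial sketch (not part of the commit): embeddings like the one returned above are commonly compared with cosine similarity, for example to find visually similar images. The snippet below is a minimal, library-free illustration; the three short vectors are hypothetical stand-ins, and in real use each would be `embedding[0].tolist()` from a separate `infer_image` call.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for (x, y) in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy embeddings (assumptions, far shorter than real ones)
emb_a = [1.0, 0.0, 2.0]
emb_b = [2.0, 0.0, 4.0]  # same direction as emb_a
emb_c = [0.0, 3.0, 0.0]  # orthogonal to emb_a

sim_ab = cosine_similarity(emb_a, emb_b)
sim_ac = cosine_similarity(emb_a, emb_c)
print(sim_ab, sim_ac)
```

Vectors pointing in the same direction score close to 1 and orthogonal vectors score 0, regardless of magnitude.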
+
+ ### Detection Feature Map
+
+ ```python
+ from PIL import Image
+ import birder
+
+ (net, class_to_idx, signature, rgb_stats) = birder.load_pretrained_model("biformer_s_il-all", inference=True)
+
+ # Get the image size the model was trained on
+ size = birder.get_size_from_signature(signature)
+
+ # Create an inference transform
+ transform = birder.classification_transform(size, rgb_stats)
+
+ image = Image.open("path/to/image.jpeg")
+ features = net.detection_features(transform(image).unsqueeze(0))
+ # features is a dict (stage name -> torch.Tensor)
+ print([(k, v.size()) for k, v in features.items()])
+ # Output example:
+ # [('stage1', torch.Size([1, 96, 96, 96])),
+ #  ('stage2', torch.Size([1, 192, 48, 48])),
+ #  ('stage3', torch.Size([1, 384, 24, 24])),
+ #  ('stage4', torch.Size([1, 768, 12, 12]))]
+ ```
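Editorial sketch (not part of the commit): the spatial sizes in the output example above follow from dividing the 384 x 384 input by each stage's downsampling stride. The strides used below (4, 8, 16, 32) are an assumption inferred from that example output, not values read from the library, and `feature_map_sizes` is a hypothetical helper.

```python
def feature_map_sizes(input_size, strides):
    """Map each stage name to its feature-map side length for a square input."""
    return {name: input_size // stride for (name, stride) in strides.items()}


# Strides inferred from the example output (assumption): 384 / 96 = 4, 384 / 48 = 8, ...
strides = {"stage1": 4, "stage2": 8, "stage3": 16, "stage4": 32}

sizes = feature_map_sizes(384, strides)
print(sizes)
```

This kind of arithmetic is useful when sizing a detection head or checking how the feature maps scale for a different input resolution.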
+
+ ## Citation
+
+ ```bibtex
+ @misc{zhu2023biformervisiontransformerbilevel,
+       title={BiFormer: Vision Transformer with Bi-Level Routing Attention},
+       author={Lei Zhu and Xinjiang Wang and Zhanghan Ke and Wayne Zhang and Rynson Lau},
+       year={2023},
+       eprint={2303.08810},
+       archivePrefix={arXiv},
+       primaryClass={cs.CV},
+       url={https://arxiv.org/abs/2303.08810},
+ }
+ ```