suous committed on
Commit 4c401bf · verified · 1 Parent(s): 43142c4

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +125 -3
README.md CHANGED
@@ -1,9 +1,131 @@
  ---
  tags:
  - image-classification
  - timm
  - transformers
- library_name: timm
- license: apache-2.0
  ---
- # Model card for recnext_a1.base_300e_in1k
  ---
+ datasets:
+ - imagenet-1k
+ language: en
+ library_name: timm
+ license: apache-2.0
+ metrics:
+ - accuracy
+ model_name: recnext_a1
+ pipeline_tag: image-classification
  tags:
+ - vision
  - image-classification
+ - pytorch
  - timm
  - transformers
  ---
+
+ # Model Card for RecNeXt-A1
+
+ [![license](https://img.shields.io/github/license/suous/RecNeXt)](https://github.com/suous/RecNeXt/blob/main/LICENSE)
+ [![arXiv](https://img.shields.io/badge/arXiv-2412.19628-red)](https://arxiv.org/abs/2412.19628)
+
+ <div style="display: flex; justify-content: space-between;">
+ <img src="https://raw.githubusercontent.com/suous/RecNeXt/refs/heads/main/figures/RecConvA.png" alt="RecConvA" style="width: 52%;">
+ <img src="https://raw.githubusercontent.com/suous/RecNeXt/refs/heads/main/figures/code.png" alt="code" style="width: 46%;">
+ </div>
+
+ ## Model Details
+
+ - **Model Type**: Image Classification / Feature Extraction
+ - **Model Series**: A
+ - **Model Stats**:
+   - **Parameters**: 5.9M
+   - **MACs**: 0.9G
+   - **Latency**: 1.9ms (iPhone 13, iOS 18)
+   - **Image Size**: 224x224
+
+ - **Architecture Configuration**:
+   - **Embedding Dimensions**: (48, 96, 192, 384)
+   - **Depths**: (3, 3, 15, 2)
+   - **MLP Ratio**: (2, 2, 2, 2)
+
+ - **Paper**: [RecConv: Efficient Recursive Convolutions for Multi-Frequency Representations](https://arxiv.org/abs/2412.19628)
+
+ - **Code**: https://github.com/suous/RecNeXt
+
+ - **Dataset**: ImageNet-1K
+
+ ## Model Usage
+
+ ### Image Classification
+
+ ```python
+ from urllib.request import urlopen
+ from PIL import Image
+ import timm
+ import torch
+
+ img = Image.open(urlopen(
+     'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
+ ))
+
+ model = timm.create_model('recnext_a1', pretrained=True, distillation=False)
+ model = model.eval()
+
+ # get model specific transforms (normalization, resize)
+ data_config = timm.data.resolve_model_data_config(model)
+ transforms = timm.data.create_transform(**data_config, is_training=False)
+
+ output = model(transforms(img).unsqueeze(0))  # unsqueeze single image into batch of 1
+
+ # top-5 probabilities (in percent) and the corresponding class indices
+ top5_probabilities, top5_class_indices = torch.topk(output.softmax(dim=1) * 100, k=5)
+ ```
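The final line of the snippet above converts the raw logits into percentages and keeps the five largest. As a minimal, dependency-free sketch of that step, the same computation can be written in plain Python. The helper names `softmax_percent` and `top_k` and the toy logits are hypothetical and only serve as illustration; they are not part of timm or RecNeXt.

```python
import math

def softmax_percent(logits):
    """Numerically stable softmax, scaled to percentages."""
    m = max(logits)  # subtract the maximum so exp() cannot overflow
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [100.0 * e / total for e in exps]

def top_k(values, k):
    """Return (values, indices) of the k largest entries, largest first."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)[:k]
    return [values[i] for i in order], order

# Toy logits for a hypothetical 6-class model (illustrative numbers only)
logits = [1.0, 3.0, 0.5, 2.0, -1.0, 0.0]
probs = softmax_percent(logits)
top5_probabilities, top5_class_indices = top_k(probs, k=5)
print(top5_class_indices)
```

With a real model, `top5_class_indices` index into the ImageNet-1K label list rather than into a toy list.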
+
+ ### Converting to Inference Mode
+
+ ```python
+ import utils  # assumed to be utils.py from the RecNeXt code repository linked above
+
+ # Convert training-time model to inference structure, fuse batchnorms
+ utils.replace_batchnorm(model)
+ ```
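Fusing batch norm relies on the fact that, at inference time, batch norm is a fixed per-channel affine map and can be folded into the weights and bias of the layer before it. The sketch below illustrates that standard identity on a tiny linear layer (equivalent to a 1x1 convolution at a single pixel). It is not the code of `utils.replace_batchnorm`; the function names and all numbers are hypothetical.

```python
import math

def linear(weights, bias, x):
    """y[o] = sum_i weights[o][i] * x[i] + bias[o]."""
    return [sum(w * v for w, v in zip(row, x)) + b0 for row, b0 in zip(weights, bias)]

def batchnorm_eval(y, gamma, beta, mean, var, eps=1e-5):
    """Inference-mode batch norm using fixed running statistics."""
    return [g * (v - m) / math.sqrt(s + eps) + bt
            for v, g, bt, m, s in zip(y, gamma, beta, mean, var)]

def fuse(weights, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold the batch-norm affine map into the preceding layer's weights and bias."""
    scale = [g / math.sqrt(s + eps) for g, s in zip(gamma, var)]
    fused_w = [[w * sc for w in row] for row, sc in zip(weights, scale)]
    fused_b = [(b0 - m) * sc + bt for b0, m, sc, bt in zip(bias, mean, scale, beta)]
    return fused_w, fused_b

# Toy layer with 3 inputs and 2 output channels (illustrative numbers only)
W = [[0.5, -1.0, 2.0], [1.5, 0.25, -0.75]]
b = [0.1, -0.2]
gamma, beta = [1.2, 0.8], [0.05, -0.1]
mean, var = [0.3, -0.4], [0.9, 1.6]
x = [1.0, 2.0, -1.0]

reference = batchnorm_eval(linear(W, b, x), gamma, beta, mean, var)
fused_W, fused_b = fuse(W, b, gamma, beta, mean, var)
fused = linear(fused_W, fused_b, x)
max_abs_diff = max(abs(r - f) for r, f in zip(reference, fused))
print(max_abs_diff)
```

The fused layer reproduces the unfused output up to floating-point rounding while removing one operation per layer, which is why fusion is applied before measuring latency.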
+ ## Model Comparison
+
+ ### Classification
+
+ We introduce two series of models: the **A** series uses linear attention and nearest-neighbor interpolation, while the **M** series uses convolution and bilinear interpolation for simplicity and broader hardware compatibility (e.g., to work around suboptimal nearest-neighbor interpolation support in some iOS versions).
+
+ > **dist**: distillation; **base**: without distillation (all models are trained over 300 epochs).
+
+ | model | top_1_accuracy | params | gmacs | npu_latency | cpu_latency | throughput | fused_weights | training_logs |
+ |-------|----------------|--------|-------|-------------|-------------|------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+ | M0 | 74.7* \| 73.2 | 2.5M | 0.4 | 1.0ms | 189ms | 763 | [dist](https://github.com/suous/RecNeXt/releases/download/v1.0/recnext_m0_distill_300e_fused.pt) \| [base](https://github.com/suous/RecNeXt/releases/download/v1.0/recnext_m0_without_distill_300e_fused.pt) | [dist](https://github.com/suous/RecNeXt/blob/main/logs/distill/recnext_m0_distill_300e.txt) \| [base](https://github.com/suous/RecNeXt/blob/main/logs/normal/recnext_m0_without_distill_300e.txt) |
+ | M1 | 79.2* \| 78.0 | 5.2M | 0.9 | 1.4ms | 361ms | 384 | [dist](https://github.com/suous/RecNeXt/releases/download/v1.0/recnext_m1_distill_300e_fused.pt) \| [base](https://github.com/suous/RecNeXt/releases/download/v1.0/recnext_m1_without_distill_300e_fused.pt) | [dist](https://github.com/suous/RecNeXt/blob/main/logs/distill/recnext_m1_distill_300e.txt) \| [base](https://github.com/suous/RecNeXt/blob/main/logs/normal/recnext_m1_without_distill_300e.txt) |
+ | M2 | 80.3* \| 79.2 | 6.8M | 1.2 | 1.5ms | 431ms | 325 | [dist](https://github.com/suous/RecNeXt/releases/download/v1.0/recnext_m2_distill_300e_fused.pt) \| [base](https://github.com/suous/RecNeXt/releases/download/v1.0/recnext_m2_without_distill_300e_fused.pt) | [dist](https://github.com/suous/RecNeXt/blob/main/logs/distill/recnext_m2_distill_300e.txt) \| [base](https://github.com/suous/RecNeXt/blob/main/logs/normal/recnext_m2_without_distill_300e.txt) |
+ | M3 | 80.9* \| 79.6 | 8.2M | 1.4 | 1.6ms | 482ms | 314 | [dist](https://github.com/suous/RecNeXt/releases/download/v1.0/recnext_m3_distill_300e_fused.pt) \| [base](https://github.com/suous/RecNeXt/releases/download/v1.0/recnext_m3_without_distill_300e_fused.pt) | [dist](https://github.com/suous/RecNeXt/blob/main/logs/distill/recnext_m3_distill_300e.txt) \| [base](https://github.com/suous/RecNeXt/blob/main/logs/normal/recnext_m3_without_distill_300e.txt) |
+ | M4 | 82.5* \| 81.1 | 14.1M | 2.4 | 2.4ms | 843ms | 169 | [dist](https://github.com/suous/RecNeXt/releases/download/v1.0/recnext_m4_distill_300e_fused.pt) \| [base](https://github.com/suous/RecNeXt/releases/download/v1.0/recnext_m4_without_distill_300e_fused.pt) | [dist](https://github.com/suous/RecNeXt/blob/main/logs/distill/recnext_m4_distill_300e.txt) \| [base](https://github.com/suous/RecNeXt/blob/main/logs/normal/recnext_m4_without_distill_300e.txt) |
+ | M5 | 83.3* \| 81.6 | 22.9M | 4.7 | 3.4ms | 1487ms | 104 | [dist](https://github.com/suous/RecNeXt/releases/download/v1.0/recnext_m5_distill_300e_fused.pt) \| [base](https://github.com/suous/RecNeXt/releases/download/v1.0/recnext_m5_without_distill_300e_fused.pt) | [dist](https://github.com/suous/RecNeXt/blob/main/logs/distill/recnext_m5_distill_300e.txt) \| [base](https://github.com/suous/RecNeXt/blob/main/logs/normal/recnext_m5_without_distill_300e.txt) |
+ | A0 | 75.0* \| 73.6 | 2.8M | 0.4 | 1.4ms | 177ms | 4902 | [dist](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_a0_distill_300e_fused.pt) \| [base](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_a0_without_distill_300e_fused.pt) | [dist](https://github.com/suous/RecNeXt/blob/main/logs/distill/recnext_a0_distill_300e.txt) \| [base](https://github.com/suous/RecNeXt/blob/main/logs/normal/recnext_a0_without_distill_300e.txt) |
+ | A1 | 79.6* \| 78.3 | 5.9M | 0.9 | 1.9ms | 334ms | 2746 | [dist](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_a1_distill_300e_fused.pt) \| [base](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_a1_without_distill_300e_fused.pt) | [dist](https://github.com/suous/RecNeXt/blob/main/logs/distill/recnext_a1_distill_300e.txt) \| [base](https://github.com/suous/RecNeXt/blob/main/logs/normal/recnext_a1_without_distill_300e.txt) |
+ | A2 | 80.8* \| 79.6 | 7.9M | 1.2 | 2.2ms | 413ms | 2327 | [dist](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_a2_distill_300e_fused.pt) \| [base](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_a2_without_distill_300e_fused.pt) | [dist](https://github.com/suous/RecNeXt/blob/main/logs/distill/recnext_a2_distill_300e.txt) \| [base](https://github.com/suous/RecNeXt/blob/main/logs/normal/recnext_a2_without_distill_300e.txt) |
+ | A3 | 81.1* \| 80.1 | 9.0M | 1.4 | 2.4ms | 447ms | 2206 | [dist](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_a3_distill_300e_fused.pt) \| [base](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_a3_without_distill_300e_fused.pt) | [dist](https://github.com/suous/RecNeXt/blob/main/logs/distill/recnext_a3_distill_300e.txt) \| [base](https://github.com/suous/RecNeXt/blob/main/logs/normal/recnext_a3_without_distill_300e.txt) |
+ | A4 | 82.5* \| 81.6 | 15.8M | 2.4 | 3.6ms | 764ms | 1265 | [dist](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_a4_distill_300e_fused.pt) \| [base](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_a4_without_distill_300e_fused.pt) | [dist](https://github.com/suous/RecNeXt/blob/main/logs/distill/recnext_a4_distill_300e.txt) \| [base](https://github.com/suous/RecNeXt/blob/main/logs/normal/recnext_a4_without_distill_300e.txt) |
+ | A5 | 83.5* \| 83.1 | 25.7M | 4.7 | 5.6ms | 1376ms | 721 | [dist](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_a5_distill_300e_fused.pt) \| [base](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_a5_without_distill_300e_fused.pt) | [dist](https://github.com/suous/RecNeXt/blob/main/logs/distill/recnext_a5_distill_300e.txt) \| [base](https://github.com/suous/RecNeXt/blob/main/logs/normal/recnext_a5_without_distill_300e.txt) |
+
+ ### Comparison with [LSNet](https://github.com/jameslahm/lsnet)
+
+ | model | top_1_accuracy | params | gmacs | npu_latency | cpu_latency | throughput | fused_weights | training_logs |
+ |-------|----------------|--------|-------|-------------|-------------|------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+ | T | 76.6* \| 75.1 | 12.1M | 0.3 | 1.8ms | 109ms | 14181 | [dist](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_t_distill_300e_fused.pt) \| [base](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_t_without_distill_300e_fused.pt) | [dist](https://github.com/suous/RecNeXt/blob/main/lsnet/logs/distill/recnext_t_distill_300e.txt) \| [base](https://github.com/suous/RecNeXt/blob/main/lsnet/logs/normal/recnext_t_without_distill_300e.txt) |
+ | S | 79.6* \| 78.3 | 15.8M | 0.7 | 2.0ms | 188ms | 8234 | [dist](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_s_distill_300e_fused.pt) \| [base](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_s_without_distill_300e_fused.pt) | [dist](https://github.com/suous/RecNeXt/blob/main/lsnet/logs/distill/recnext_s_distill_300e.txt) \| [base](https://github.com/suous/RecNeXt/blob/main/lsnet/logs/normal/recnext_s_without_distill_300e.txt) |
+ | B | 81.4* \| 80.3 | 19.3M | 1.1 | 2.5ms | 290ms | 4385 | [dist](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_b_distill_300e_fused.pt) \| [base](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_b_without_distill_300e_fused.pt) | [dist](https://github.com/suous/RecNeXt/blob/main/lsnet/logs/distill/recnext_b_distill_300e.txt) \| [base](https://github.com/suous/RecNeXt/blob/main/lsnet/logs/normal/recnext_b_without_distill_300e.txt) |
+
+ > The NPU latency is measured on an iPhone 13 with models compiled by Core ML Tools.
+ > The CPU latency is measured on a quad-core ARM Cortex-A57 processor in ONNX format.
+ > The throughput is measured on an NVIDIA RTX 3090 with the maximum power-of-two batch size that fits in memory.
+
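The throughput note mentions the maximum power-of-two batch size that fits in memory. One plausible way to find such a batch size is to keep doubling until a batch no longer fits. The sketch below assumes that fitting is monotonic in the batch size; `largest_power_of_two_batch`, `fits_in_memory`, and the memory figures are hypothetical stand-ins, not the benchmark script used for the tables above.

```python
def largest_power_of_two_batch(fits, limit=2**20):
    """Return the largest power-of-two batch size for which fits(batch) is true.

    Returns 0 if even a batch of 1 does not fit. `limit` caps the search.
    """
    batch, best = 1, 0
    while batch <= limit and fits(batch):
        best = batch
        batch *= 2
    return best

# Hypothetical memory model: fixed overhead plus a per-image cost, in MiB.
# A real benchmark would instead run a forward pass and catch out-of-memory errors.
def fits_in_memory(batch, budget_mib=24 * 1024, overhead_mib=1500, per_image_mib=9):
    return overhead_mib + batch * per_image_mib <= budget_mib

batch_size = largest_power_of_two_batch(fits_in_memory)
print(batch_size)
```

Restricting the search to powers of two keeps the number of trial runs logarithmic in the final batch size.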
+
+ ## Citation
+
+ ```BibTeX
+ @misc{zhao2024recnext,
+   title={RecConv: Efficient Recursive Convolutions for Multi-Frequency Representations},
+   author={Mingshu Zhao and Yi Luo and Yong Ouyang},
+   year={2024},
+   eprint={2412.19628},
+   archivePrefix={arXiv},
+   primaryClass={cs.CV}
+ }
+ ```