suous committed · Commit 5519f2a · verified · 1 Parent(s): 2dc1957

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +94 -155
README.md CHANGED
@@ -19,10 +19,10 @@ tags:
19
  # Model Card for RecNeXt-A1
20
 
21
  ## Abstract
22
- Recent advances in vision transformers (ViTs) have demonstrated the advantage of global modeling capabilities, prompting widespread integration of large-kernel convolutions for enlarging the effective receptive field (ERF). However, the quadratic scaling of parameter count and computational complexity (FLOPs) with respect to kernel size poses significant efficiency and optimization challenges. This paper introduces RecConv, a recursive decomposition strategy that efficiently constructs multi-frequency representations using small-kernel convolutions. RecConv establishes a linear relationship between parameter growth and decomposing levels which determines the effective receptive field $k\times 2^\ell$ for a base kernel $k$ and $\ell$ levels of decomposition, while maintaining constant FLOPs regardless of the ERF expansion. Specifically, RecConv achieves a parameter expansion of only $\ell+2$ times and a maximum FLOPs increase of $5/3$ times, compared to the exponential growth ($4^\ell$) of standard and depthwise convolutions. RecNeXt-M3 outperforms RepViT-M1.1 by 1.9 $AP^{box}$ on COCO with similar FLOPs. This innovation provides a promising avenue towards designing efficient and compact networks across various modalities. Codes and models can be found at this https URL .
23
 
24
  [![license](https://img.shields.io/github/license/suous/RecNeXt)](https://github.com/suous/RecNeXt/blob/main/LICENSE)
25
- [![arXiv](https://img.shields.io/badge/arXiv-2406.16004-red)](https://arxiv.org/abs/2412.19628)
26
 
27
  <div style="display: flex; justify-content: space-between;">
28
  <img src="https://raw.githubusercontent.com/suous/RecNeXt/refs/heads/main/figures/RecConvA.png" alt="RecConvA" style="width: 52%;">
@@ -37,6 +37,7 @@ Recent advances in vision transformers (ViTs) have demonstrated the advantage of
37
  - **Parameters**: 5.9M
38
  - **MACs**: 0.9G
39
  - **Latency**: 1.9ms (iPhone 13, iOS 18)
 
40
  - **Image Size**: 224x224
41
 
42
  - **Architecture Configuration**:
@@ -95,7 +96,6 @@ import utils
95
  # Convert training-time model to inference structure, fuse batchnorms
96
  utils.replace_batchnorm(model)
97
  ```
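For readers unfamiliar with batch-norm fusion, the sketch below shows the underlying arithmetic on a single channel: a linear op `y = w * x + b` followed by batch norm in inference mode collapses into one affine op. The helpers `fold_bn` and `bn` are hypothetical illustrations and are not the repository's `utils.replace_batchnorm`, whose implementation is not shown here.

```python
import math


def fold_bn(w, b, gamma, beta, mean, var, eps=1e-5):
    """Fold inference-mode batch norm into the preceding per-channel affine op.

    BN(w*x + b) = gamma * (w*x + b - mean) / sqrt(var + eps) + beta
                = (w * s) * x + ((b - mean) * s + beta),  s = gamma / sqrt(var + eps)
    """
    scale = gamma / math.sqrt(var + eps)
    return w * scale, (b - mean) * scale + beta


def bn(y, gamma, beta, mean, var, eps=1e-5):
    """Reference batch norm using fixed running statistics (inference mode)."""
    return gamma * (y - mean) / math.sqrt(var + eps) + beta


# Illustrative values only (assumed, not taken from any checkpoint).
w, b = 0.8, 0.1
stats = dict(gamma=1.5, beta=-0.2, mean=0.3, var=4.0)

fw, fb = fold_bn(w, b, **stats)
x = 2.0
unfused = bn(w * x + b, **stats)   # two ops: affine, then BN
fused = fw * x + fb                # one op with folded weights
print(abs(unfused - fused) < 1e-9)
```

The same per-channel scale and shift apply to every output channel of a convolution, which is why fusion removes the batch-norm layer at inference time without changing the outputs.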
98
-
99
  ## Model Comparison
100
 
101
  ### Classification
@@ -104,35 +104,50 @@ We introduce two series of models: the **A** series uses linear attention and ne
104
 
105
  > **dist**: distillation; **base**: without distillation (all models are trained over 300 epochs).
106
 
107
- | model | top_1_accuracy | params | gmacs | npu_latency | cpu_latency | throughput | fused_weights | training_logs |
108
- |-------|----------------|--------|-------|-------------|-------------|------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
109
- | M0 | 74.7* \| 73.2 | 2.5M | 0.4 | 1.0ms | 189ms | 763 | [dist](https://github.com/suous/RecNeXt/releases/download/v1.0/recnext_m0_distill_300e_fused.pt) \| [base](https://github.com/suous/RecNeXt/releases/download/v1.0/recnext_m0_without_distill_300e_fused.pt) | [dist](https://github.com/suous/RecNeXt/blob/main/logs/distill/recnext_m0_distill_300e.txt) \| [base](https://github.com/suous/RecNeXt/blob/main/logs/normal/recnext_m0_without_distill_300e.txt) |
110
- | M1 | 79.2* \| 78.0 | 5.2M | 0.9 | 1.4ms | 361ms | 384 | [dist](https://github.com/suous/RecNeXt/releases/download/v1.0/recnext_m1_distill_300e_fused.pt) \| [base](https://github.com/suous/RecNeXt/releases/download/v1.0/recnext_m1_without_distill_300e_fused.pt) | [dist](https://github.com/suous/RecNeXt/blob/main/logs/distill/recnext_m1_distill_300e.txt) \| [base](https://github.com/suous/RecNeXt/blob/main/logs/normal/recnext_m1_without_distill_300e.txt) |
111
- | M2 | 80.3* \| 79.2 | 6.8M | 1.2 | 1.5ms | 431ms | 325 | [dist](https://github.com/suous/RecNeXt/releases/download/v1.0/recnext_m2_distill_300e_fused.pt) \| [base](https://github.com/suous/RecNeXt/releases/download/v1.0/recnext_m2_without_distill_300e_fused.pt) | [dist](https://github.com/suous/RecNeXt/blob/main/logs/distill/recnext_m2_distill_300e.txt) \| [base](https://github.com/suous/RecNeXt/blob/main/logs/normal/recnext_m2_without_distill_300e.txt) |
112
- | M3 | 80.9* \| 79.6 | 8.2M | 1.4 | 1.6ms | 482ms | 314 | [dist](https://github.com/suous/RecNeXt/releases/download/v1.0/recnext_m3_distill_300e_fused.pt) \| [base](https://github.com/suous/RecNeXt/releases/download/v1.0/recnext_m3_without_distill_300e_fused.pt) | [dist](https://github.com/suous/RecNeXt/blob/main/logs/distill/recnext_m3_distill_300e.txt) \| [base](https://github.com/suous/RecNeXt/blob/main/logs/normal/recnext_m3_without_distill_300e.txt) |
113
- | M4 | 82.5* \| 81.1 | 14.1M | 2.4 | 2.4ms | 843ms | 169 | [dist](https://github.com/suous/RecNeXt/releases/download/v1.0/recnext_m4_distill_300e_fused.pt) \| [base](https://github.com/suous/RecNeXt/releases/download/v1.0/recnext_m4_without_distill_300e_fused.pt) | [dist](https://github.com/suous/RecNeXt/blob/main/logs/distill/recnext_m4_distill_300e.txt) \| [base](https://github.com/suous/RecNeXt/blob/main/logs/normal/recnext_m4_without_distill_300e.txt) |
114
- | M5 | 83.3* \| 81.6 | 22.9M | 4.7 | 3.4ms | 1487ms | 104 | [dist](https://github.com/suous/RecNeXt/releases/download/v1.0/recnext_m5_distill_300e_fused.pt) \| [base](https://github.com/suous/RecNeXt/releases/download/v1.0/recnext_m5_without_distill_300e_fused.pt) | [dist](https://github.com/suous/RecNeXt/blob/main/logs/distill/recnext_m5_distill_300e.txt) \| [base](https://github.com/suous/RecNeXt/blob/main/logs/normal/recnext_m5_without_distill_300e.txt) |
115
- | A0 | 75.0* \| 73.6 | 2.8M | 0.4 | 1.4ms | 177ms | 4902 | [dist](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_a0_distill_300e_fused.pt) \| [base](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_a0_without_distill_300e_fused.pt) | [dist](https://github.com/suous/RecNeXt/blob/main/logs/distill/recnext_a0_distill_300e.txt) \| [base](https://github.com/suous/RecNeXt/blob/main/logs/normal/recnext_a0_without_distill_300e.txt) |
116
- | A1 | 79.6* \| 78.3 | 5.9M | 0.9 | 1.9ms | 334ms | 2746 | [dist](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_a1_distill_300e_fused.pt) \| [base](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_a1_without_distill_300e_fused.pt) | [dist](https://github.com/suous/RecNeXt/blob/main/logs/distill/recnext_a1_distill_300e.txt) \| [base](https://github.com/suous/RecNeXt/blob/main/logs/normal/recnext_a1_without_distill_300e.txt) |
117
- | A2 | 80.8* \| 79.6 | 7.9M | 1.2 | 2.2ms | 413ms | 2327 | [dist](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_a2_distill_300e_fused.pt) \| [base](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_a2_without_distill_300e_fused.pt) | [dist](https://github.com/suous/RecNeXt/blob/main/logs/distill/recnext_a2_distill_300e.txt) \| [base](https://github.com/suous/RecNeXt/blob/main/logs/normal/recnext_a2_without_distill_300e.txt) |
118
- | A3 | 81.1* \| 80.1 | 9.0M | 1.4 | 2.4ms | 447ms | 2206 | [dist](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_a3_distill_300e_fused.pt) \| [base](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_a3_without_distill_300e_fused.pt) | [dist](https://github.com/suous/RecNeXt/blob/main/logs/distill/recnext_a3_distill_300e.txt) \| [base](https://github.com/suous/RecNeXt/blob/main/logs/normal/recnext_a3_without_distill_300e.txt) |
119
- | A4 | 82.5* \| 81.6 | 15.8M | 2.4 | 3.6ms | 764ms | 1265 | [dist](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_a4_distill_300e_fused.pt) \| [base](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_a4_without_distill_300e_fused.pt) | [dist](https://github.com/suous/RecNeXt/blob/main/logs/distill/recnext_a4_distill_300e.txt) \| [base](https://github.com/suous/RecNeXt/blob/main/logs/normal/recnext_a4_without_distill_300e.txt) |
120
- | A5 | 83.5* \| 83.1 | 25.7M | 4.7 | 5.6ms | 1376ms | 721 | [dist](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_a5_distill_300e_fused.pt) \| [base](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_a5_without_distill_300e_fused.pt) | [dist](https://github.com/suous/RecNeXt/blob/main/logs/distill/recnext_a5_distill_300e.txt) \| [base](https://github.com/suous/RecNeXt/blob/main/logs/normal/recnext_a5_without_distill_300e.txt) |
121
 
122
  ### Comparison with [LSNet](https://github.com/jameslahm/lsnet)
123
 
124
- | model | top_1_accuracy | params | gmacs | npu_latency | cpu_latency | throughput | fused_weights | training_logs |
125
- |-------|----------------|--------|-------|-------------|-------------|------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
126
- | T | 76.6* \| 75.1 | 12.1M | 0.3 | 1.8ms | 109ms | 14181 | [dist](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_t_distill_300e_fused.pt) \| [base](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_t_without_distill_300e_fused.pt) | [dist](https://github.com/suous/RecNeXt/blob/main/lsnet/logs/distill/recnext_t_distill_300e.txt) \| [base](https://github.com/suous/RecNeXt/blob/main/lsnet/logs/normal/recnext_t_without_distill_300e.txt) |
127
- | S | 79.6* \| 78.3 | 15.8M | 0.7 | 2.0ms | 188ms | 8234 | [dist](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_s_distill_300e_fused.pt) \| [base](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_s_without_distill_300e_fused.pt) | [dist](https://github.com/suous/RecNeXt/blob/main/lsnet/logs/distill/recnext_s_distill_300e.txt) \| [base](https://github.com/suous/RecNeXt/blob/main/lsnet/logs/normal/recnext_s_without_distill_300e.txt) |
128
- | B | 81.4* \| 80.3 | 19.3M | 1.1 | 2.5ms | 290ms | 4385 | [dist](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_b_distill_300e_fused.pt) \| [base](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_b_without_distill_300e_fused.pt) | [dist](https://github.com/suous/RecNeXt/blob/main/lsnet/logs/distill/recnext_b_distill_300e.txt) \| [base](https://github.com/suous/RecNeXt/blob/main/lsnet/logs/normal/recnext_b_without_distill_300e.txt) |
129
 
130
  > The NPU latency is measured on an iPhone 13 with models compiled by Core ML Tools.
131
  > The CPU latency is measured on a quad-core ARM Cortex-A57 processor with models in ONNX format.
132
  > The throughput is tested on an Nvidia RTX 3090 with the maximum power-of-two batch size that fits in memory.
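
The "maximum power-of-two batch size that fits in memory" can be found by doubling until a trial fails. The sketch below is a minimal illustration, not the repository's benchmarking script: `max_power_of_two_batch` and the `fits` callback are hypothetical names, and `fits` stands in for "run one forward pass at this batch size and report whether it succeeded". It assumes that if a batch size does not fit, no larger one will.

```python
def max_power_of_two_batch(fits, limit=2 ** 16):
    """Return the largest power-of-two batch size <= limit accepted by fits().

    Returns 0 if even a batch size of 1 does not fit.
    """
    best = 0
    batch_size = 1
    while batch_size <= limit and fits(batch_size):
        best = batch_size
        batch_size *= 2  # try the next power of two
    return best


# Toy stand-in for a memory check: pretend each sample costs 3 units of a 1000-unit budget.
print(max_power_of_two_batch(lambda bs: bs * 3 <= 1000))  # 256
```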
133
 
134
 
135
- ## Latency Measurement
136
 
137
  The latency reported in RecNeXt for iPhone 13 (iOS 18) uses the benchmark tool from [Xcode 14](https://developer.apple.com/videos/play/wwdc2022/10027/).
138
 
@@ -220,6 +235,27 @@ RecNeXt-A5
220
  <img src="https://raw.githubusercontent.com/suous/RecNeXt/main/figures/latency/recnext_a5_224x224.png" alt="recnext_a5">
221
  </details>
222
 
223
  Tips: export the model to Core ML format
224
  ```
225
  python export_coreml.py --model recnext_m1 --ckpt pretrain/recnext_m1_distill_300e.pth
@@ -232,7 +268,7 @@ python speed_gpu.py --model recnext_m1
232
  ## ImageNet (Training and Evaluation)
233
 
234
  ### Prerequisites
235
- `conda` virtual environment is recommended.
236
  ```
237
  conda create -n recnext python=3.8
238
  pip install -r requirements.txt
@@ -271,9 +307,9 @@ To train RecNeXt-M1 on an 8-GPU machine:
271
  ```
272
  python -m torch.distributed.launch --nproc_per_node=8 --master_port 12346 --use_env main.py --model recnext_m1 --data-path ~/imagenet --dist-eval
273
  ```
274
- Tips: specify your data path and model name!
275
 
276
- ### Testing
277
  For example, to test RecNeXt-M1:
278
  ```
279
  python main.py --eval --model recnext_m1 --resume pretrain/recnext_m1_distill_300e.pth --data-path ~/imagenet
@@ -309,67 +345,27 @@ python publish.py --model_name recnext_m1 --checkpoint_path pretrain/checkpoint_
309
  ```
310
 
311
  ## Downstream Tasks
312
- [Object Detection and Instance Segmentation](detection/README.md)<br>
313
-
314
- | Model | $AP^b$ | $AP_{50}^b$ | $AP_{75}^b$ | $AP^m$ | $AP_{50}^m$ | $AP_{75}^m$ | Latency | Ckpt | Log |
315
- |:-----------|:------:|:-----------:|:-----------:|:------:|:-----------:|:-----------:|:-------:|:---------------------------------------------------------------------------------:|:-------------------------------------------:|
316
- | RecNeXt-M3 | 41.7 | 63.4 | 45.4 | 38.6 | 60.5 | 41.4 | 5.2ms | [M3](https://github.com/suous/RecNeXt/releases/download/v1.0/recnext_m3_coco.pth) | [M3](./detection/logs/recnext_m3_coco.json) |
317
- | RecNeXt-M4 | 43.5 | 64.9 | 47.7 | 39.7 | 62.1 | 42.4 | 7.6ms | [M4](https://github.com/suous/RecNeXt/releases/download/v1.0/recnext_m4_coco.pth) | [M4](./detection/logs/recnext_m4_coco.json) |
318
- | RecNeXt-M5 | 44.6 | 66.3 | 49.0 | 40.6 | 63.5 | 43.5 | 12.4ms | [M5](https://github.com/suous/RecNeXt/releases/download/v1.0/recnext_m5_coco.pth) | [M5](./detection/logs/recnext_m5_coco.json) |
319
- | RecNeXt-A3 | 42.1 | 64.1 | 46.2 | 38.8 | 61.1 | 41.6 | 8.3ms | [A3](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_a3_coco.pth) | [A3](./detection/logs/recnext_a3_coco.json) |
320
- | RecNeXt-A4 | 43.5 | 65.4 | 47.6 | 39.8 | 62.4 | 42.9 | 14.0ms | [A4](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_a4_coco.pth) | [A4](./detection/logs/recnext_a4_coco.json) |
321
- | A5 | 44.4 | 66.3 | 48.9 | 40.3 | 63.3 | 43.4 | 25.3ms | [A5](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_a5_coco.pth) | [A5](./detection/logs/recnext_a5_coco.json) |
322
- ```bash
323
- # this script is used to validate the detection results
324
- fd json detection/logs -x sh -c 'printf "%.1f %s
325
- " "$(tail -n +2 {} | jq -s "map(.bbox_mAP) | max * 100")" "{}"' | sort -k2
326
- ```
327
-
328
- <details>
329
- <summary>
330
- <span>output</span>
331
- </summary>
332
-
333
- ```
334
- 42.1 detection/logs/recnext_a3_coco.json
335
- 43.5 detection/logs/recnext_a4_coco.json
336
- 44.4 detection/logs/recnext_a5_coco.json
337
- 41.7 detection/logs/recnext_m3_coco.json
338
- 43.5 detection/logs/recnext_m4_coco.json
339
- 44.6 detection/logs/recnext_m5_coco.json
340
- ```
341
- </details>
342
-
343
- [Semantic Segmentation](segmentation/README.md)
344
-
345
- | Model | mIoU | Latency | Ckpt | Log |
346
- |:-----------|:----:|:-------:|:-----------------------------------------------------------------------------------:|:------------------------------------------------:|
347
- | RecNeXt-M3 | 41.0 | 5.6ms | [M3](https://github.com/suous/RecNeXt/releases/download/v1.0/recnext_m3_ade20k.pth) | [M3](./segmentation/logs/recnext_m3_ade20k.json) |
348
- | RecNeXt-M4 | 43.6 | 7.2ms | [M4](https://github.com/suous/RecNeXt/releases/download/v1.0/recnext_m4_ade20k.pth) | [M4](./segmentation/logs/recnext_m4_ade20k.json) |
349
- | RecNeXt-M5 | 46.0 | 12.4ms | [M5](https://github.com/suous/RecNeXt/releases/download/v1.0/recnext_m5_ade20k.pth) | [M5](./segmentation/logs/recnext_m5_ade20k.json) |
350
- | RecNeXt-A3 | 41.9 | 8.4ms | [A3](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_a3_ade20k.pth) | [A3](./segmentation/logs/recnext_a3_ade20k.json) |
351
- | RecNeXt-A4 | 43.0 | 14.0ms | [A4](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_a4_ade20k.pth) | [A4](./segmentation/logs/recnext_a4_ade20k.json) |
352
- | A5 | 46.5 | 25.3ms | [A5](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_a5_ade20k.pth) | [A5](./segmentation/logs/recnext_a5_ade20k.json) |
353
- ```bash
354
- # this script is used to validate the segmentation results
355
- fd json segmentation/logs -x sh -c 'printf "%.1f %s
356
- " "$(tail -n +2 {} | jq -s "map(.mIoU) | max * 100")" "{}"' | sort -k2
357
- ```
358
-
359
- <details>
360
- <summary>
361
- <span>output</span>
362
- </summary>
363
-
364
- ```
365
- 41.9 segmentation/logs/recnext_a3_ade20k.json
366
- 43.0 segmentation/logs/recnext_a4_ade20k.json
367
- 46.5 segmentation/logs/recnext_a5_ade20k.json
368
- 41.0 segmentation/logs/recnext_m3_ade20k.json
369
- 43.6 segmentation/logs/recnext_m4_ade20k.json
370
- 46.0 segmentation/logs/recnext_m5_ade20k.json
371
- ```
372
- </details>
373
 
374
  ## Ablation Study
375
 
@@ -407,40 +403,6 @@ logs/ablation
407
  β”œβ”€β”€ <a style="text-decoration:none" href="https://raw.githubusercontent.com/suous/RecNeXt/main/logs/ablation/384/recnext_m1_120e_384x384_rec_convtrans_7x7_group_7791.txt">recnext_m1_120e_384x384_rec_convtrans_7x7_group_7791.txt</a>
408
  └── <a style="text-decoration:none" href="https://raw.githubusercontent.com/suous/RecNeXt/main/logs/ablation/384/recnext_m1_120e_384x384_rec_convtrans_7x7_split_7683.txt">recnext_m1_120e_384x384_rec_convtrans_7x7_split_7683.txt</a>
409
  </pre>
410
-
411
- ```bash
412
- # this script is used to validate the ablation results
413
- fd txt logs/ablation -x sh -c 'printf "%.2f %s
414
- " "$(jq -s "map(.test_acc1) | max" {})" "{}"' | sort -k2
415
- ```
416
-
417
- <details>
418
- <summary>
419
- <span>output</span>
420
- </summary>
421
-
422
- ```
423
- 74.64 logs/ablation/224/recnext_m1_120e_224x224_3x3_7464.txt
424
- 75.52 logs/ablation/224/recnext_m1_120e_224x224_7x7_7552.txt
425
- 75.41 logs/ablation/224/recnext_m1_120e_224x224_bxb_7541.txt
426
- 75.48 logs/ablation/224/recnext_m1_120e_224x224_rec_3x3_7548.txt
427
- 76.03 logs/ablation/224/recnext_m1_120e_224x224_rec_5x5_7603.txt
428
- 75.67 logs/ablation/224/recnext_m1_120e_224x224_rec_7x7_7567.txt
429
- 75.71 logs/ablation/224/recnext_m1_120e_224x224_rec_7x7_nearest_7571.txt
430
- 75.93 logs/ablation/224/recnext_m1_120e_224x224_rec_7x7_nearest_ssm_7593.txt
431
- 75.48 logs/ablation/224/recnext_m1_120e_224x224_rec_7x7_unpool_7548.txt
432
- 76.35 logs/ablation/384/recnext_m1_120e_384x384_3x3_7635.txt
433
- 77.42 logs/ablation/384/recnext_m1_120e_384x384_7x7_7742.txt
434
- 78.00 logs/ablation/384/recnext_m1_120e_384x384_bxb_7800.txt
435
- 77.72 logs/ablation/384/recnext_m1_120e_384x384_rec_3x3_7772.txt
436
- 78.11 logs/ablation/384/recnext_m1_120e_384x384_rec_5x5_7811.txt
437
- 78.03 logs/ablation/384/recnext_m1_120e_384x384_rec_7x7_7803.txt
438
- 77.26 logs/ablation/384/recnext_m1_120e_384x384_rec_convtrans_3x3_basic_7726.txt
439
- 77.87 logs/ablation/384/recnext_m1_120e_384x384_rec_convtrans_5x5_basic_7787.txt
440
- 78.24 logs/ablation/384/recnext_m1_120e_384x384_rec_convtrans_7x7_basic_7824.txt
441
- 77.91 logs/ablation/384/recnext_m1_120e_384x384_rec_convtrans_7x7_group_7791.txt
442
- 76.84 logs/ablation/384/recnext_m1_120e_384x384_rec_convtrans_7x7_split_7683.txt
443
- ```
444
  </details>
445
 
446
  <details>
@@ -463,8 +425,8 @@ class RecConv2d(nn.Module):
463
  'bias': bias
464
  }
465
  self.n = nn.Conv2d(stride=2, **kwargs)
466
- self.a = nn.Conv2d(**kwargs)
467
- self.b = nn.Conv2d(**kwargs) if level >1 else None
468
  self.c = nn.Conv2d(**kwargs)
469
  self.d = nn.Conv2d(**kwargs)
470
 
@@ -588,9 +550,14 @@ class RecConv2d(nn.Module):
588
 
589
  ### RecConv Beyond
590
 
591
- We apply RecConv to [MLLA](https://github.com/LeapLabTHU/MLLA) small variants, replacing linear attention and downsampling layers.
592
  This results in higher throughput and lower training memory usage.
593
 
594
  <pre>
595
  mlla/logs
596
  ├── 1_mlla_nano
@@ -606,32 +573,6 @@ mlla/logs
606
  β”œβ”€β”€ <a style="text-decoration:none" href="https://raw.githubusercontent.com/suous/RecNeXt/main/mlla/logs/2_mlla_mini/04_recattn_nearest_interp.txt">04_recattn_nearest_interp.txt</a>
607
  └── <a style="text-decoration:none" href="https://raw.githubusercontent.com/suous/RecNeXt/main/mlla/logs/2_mlla_mini/05_recattn_nearest_interp_simplify.txt">05_recattn_nearest_interp_simplify.txt</a>
608
  </pre>
609
-
610
-
611
- ```bash
612
- # this script is used to validate the ablation results
613
- fd txt mlla/logs -x sh -c 'printf "%.2f %s
614
- " "$(rg -N -I -U -o "EPOCH.*
615
- .*Acc@1 (\d+\.\d+)" -r "\$1" {} | sort -n | tail -1)" "{}"' | sort -k2
616
- ```
617
-
618
- <details>
619
- <summary>
620
- <span>output</span>
621
- </summary>
622
-
623
- ```
624
- 76.26 mlla/logs/1_mlla_nano/01_baseline.txt
625
- 77.09 mlla/logs/1_mlla_nano/02_recconv_5x5_conv_trans.txt
626
- 77.14 mlla/logs/1_mlla_nano/03_recconv_5x5_nearest_interp.txt
627
- 76.53 mlla/logs/1_mlla_nano/04_recattn_nearest_interp.txt
628
- 77.28 mlla/logs/1_mlla_nano/05_recattn_nearest_interp_simplify.txt
629
- 82.27 mlla/logs/2_mlla_mini/01_baseline.txt
630
- 82.06 mlla/logs/2_mlla_mini/02_recconv_5x5_conv_trans.txt
631
- 81.94 mlla/logs/2_mlla_mini/03_recconv_5x5_nearest_interp.txt
632
- 82.08 mlla/logs/2_mlla_mini/04_recattn_nearest_interp.txt
633
- 82.16 mlla/logs/2_mlla_mini/05_recattn_nearest_interp_simplify.txt
634
- ```
635
  </details>
636
 
637
  ## Limitations
@@ -642,16 +583,14 @@ fd txt mlla/logs -x sh -c 'printf "%.2f %s
642
 
643
  ## Acknowledgement
644
 
645
- Classification (ImageNet) code base is partly built with [LeViT](https://github.com/facebookresearch/LeViT), [PoolFormer](https://github.com/sail-sg/poolformer), [EfficientFormer](https://github.com/snap-research/EfficientFormer), [RepViT](https://github.com/THU-MIG/RepViT), and [MogaNet](https://github.com/Westlake-AI/MogaNet).
646
 
647
- The detection and segmentation pipeline is from [MMCV](https://github.com/open-mmlab/mmcv) ([MMDetection](https://github.com/open-mmlab/mmdetection) and [MMSegmentation](https://github.com/open-mmlab/mmsegmentation)).
648
 
649
- Thanks for the great implementations!
650
 
651
  ## Citation
652
 
653
- If our code or models help your work, please cite our papers and give us a star 🌟!
654
-
655
  ```BibTeX
656
  @misc{zhao2024recnext,
657
  title={RecConv: Efficient Recursive Convolutions for Multi-Frequency Representations},
 
19
  # Model Card for RecNeXt-A1
20
 
21
  ## Abstract
22
+ Recent advances in vision transformers (ViTs) have demonstrated the advantage of global modeling capabilities, prompting widespread integration of large-kernel convolutions for enlarging the effective receptive field (ERF). However, the quadratic scaling of parameter count and computational complexity (FLOPs) with respect to kernel size poses significant efficiency and optimization challenges. This paper introduces RecConv, a recursive decomposition strategy that efficiently constructs multi-frequency representations using small-kernel convolutions. RecConv establishes a linear relationship between parameter growth and decomposing levels which determines the effective receptive field $k\times 2^\ell$ for a base kernel $k$ and $\ell$ levels of decomposition, while maintaining constant FLOPs regardless of the ERF expansion. Specifically, RecConv achieves a parameter expansion of only $\ell+2$ times and a maximum FLOPs increase of $5/3$ times, compared to the exponential growth ($4^\ell$) of standard and depthwise convolutions. RecNeXt-M3 outperforms RepViT-M1.1 by 1.9 $AP^{box}$ on COCO with similar FLOPs. This innovation provides a promising avenue towards designing efficient and compact networks across various modalities. Codes and models can be found at https://github.com/suous/RecNeXt.
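
As a rough illustration of the scaling claims in the abstract, the sketch below tabulates, for a base kernel $k$ and $\ell$ decomposition levels, the effective receptive field $k\times 2^\ell$, the $4^\ell$ parameter growth of a single kernel enlarged to that size, and the $\ell+2$ expansion quoted for RecConv. The function name `scaling_table` is hypothetical, and the values are plain arithmetic on the abstract's formulas, not measurements.

```python
def scaling_table(k: int, max_level: int):
    """Return rows of (level, erf, large_kernel_growth, recconv_growth)."""
    rows = []
    for level in range(1, max_level + 1):
        erf = k * 2 ** level                          # effective receptive field k * 2^l
        large_kernel_growth = (erf * erf) // (k * k)  # (k * 2^l)^2 / k^2 = 4^l
        recconv_growth = level + 2                    # expansion quoted in the abstract
        rows.append((level, erf, large_kernel_growth, recconv_growth))
    return rows


for level, erf, large, rec in scaling_table(k=5, max_level=4):
    print(f"l={level}  ERF={erf:3d}  large-kernel x{large:<4d} RecConv x{rec}")
```

The gap widens quickly: at four levels the enlarged kernel needs 256 times the base parameters, against the factor of 6 stated for RecConv.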
23
 
24
  [![license](https://img.shields.io/github/license/suous/RecNeXt)](https://github.com/suous/RecNeXt/blob/main/LICENSE)
25
+ [![arXiv](https://img.shields.io/badge/arXiv-2412.19628-red)](https://arxiv.org/abs/2412.19628)
26
 
27
  <div style="display: flex; justify-content: space-between;">
28
  <img src="https://raw.githubusercontent.com/suous/RecNeXt/refs/heads/main/figures/RecConvA.png" alt="RecConvA" style="width: 52%;">
 
37
  - **Parameters**: 5.9M
38
  - **MACs**: 0.9G
39
  - **Latency**: 1.9ms (iPhone 13, iOS 18)
40
+ - **Throughput**: 2730 (RTX 3090)
41
  - **Image Size**: 224x224
42
 
43
  - **Architecture Configuration**:
 
96
  # Convert training-time model to inference structure, fuse batchnorms
97
  utils.replace_batchnorm(model)
98
  ```
 
99
  ## Model Comparison
100
 
101
  ### Classification
 
104
 
105
  > **dist**: distillation; **base**: without distillation (all models are trained over 300 epochs).
106
 
107
+ | model | top_1_accuracy | params | gmacs | npu_latency | cpu_latency | throughput | fused_weights | training_logs |
108
+ |-------|----------------|--------|-------|-------------|-------------|------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
109
+ | M0 | 74.7* \| 73.2 | 2.5M | 0.4 | 1.0ms | 189ms | 750 | [dist](https://github.com/suous/RecNeXt/releases/download/v1.0/recnext_m0_distill_300e_fused.pt) \| [base](https://github.com/suous/RecNeXt/releases/download/v1.0/recnext_m0_without_distill_300e_fused.pt) | [dist](https://raw.githubusercontent.com/suous/RecNeXt/main/logs/distill/recnext_m0_distill_300e.txt) \| [base](https://raw.githubusercontent.com/suous/RecNeXt/main/logs/normal/recnext_m0_without_distill_300e.txt) |
110
+ | M1 | 79.2* \| 78.0 | 5.2M | 0.9 | 1.4ms | 361ms | 384 | [dist](https://github.com/suous/RecNeXt/releases/download/v1.0/recnext_m1_distill_300e_fused.pt) \| [base](https://github.com/suous/RecNeXt/releases/download/v1.0/recnext_m1_without_distill_300e_fused.pt) | [dist](https://raw.githubusercontent.com/suous/RecNeXt/main/logs/distill/recnext_m1_distill_300e.txt) \| [base](https://raw.githubusercontent.com/suous/RecNeXt/main/logs/normal/recnext_m1_without_distill_300e.txt) |
111
+ | M2 | 80.3* \| 79.2 | 6.8M | 1.2 | 1.5ms | 431ms | 325 | [dist](https://github.com/suous/RecNeXt/releases/download/v1.0/recnext_m2_distill_300e_fused.pt) \| [base](https://github.com/suous/RecNeXt/releases/download/v1.0/recnext_m2_without_distill_300e_fused.pt) | [dist](https://raw.githubusercontent.com/suous/RecNeXt/main/logs/distill/recnext_m2_distill_300e.txt) \| [base](https://raw.githubusercontent.com/suous/RecNeXt/main/logs/normal/recnext_m2_without_distill_300e.txt) |
112
+ | M3 | 80.9* \| 79.6 | 8.2M | 1.4 | 1.6ms | 482ms | 314 | [dist](https://github.com/suous/RecNeXt/releases/download/v1.0/recnext_m3_distill_300e_fused.pt) \| [base](https://github.com/suous/RecNeXt/releases/download/v1.0/recnext_m3_without_distill_300e_fused.pt) | [dist](https://raw.githubusercontent.com/suous/RecNeXt/main/logs/distill/recnext_m3_distill_300e.txt) \| [base](https://raw.githubusercontent.com/suous/RecNeXt/main/logs/normal/recnext_m3_without_distill_300e.txt) |
113
+ | M4 | 82.5* \| 81.4 | 14.1M | 2.4 | 2.4ms | 843ms | 169 | [dist](https://github.com/suous/RecNeXt/releases/download/v1.0/recnext_m4_distill_300e_fused.pt) \| [base](https://github.com/suous/RecNeXt/releases/download/v1.0/recnext_m4_without_distill_300e_fused.pt) | [dist](https://raw.githubusercontent.com/suous/RecNeXt/main/logs/distill/recnext_m4_distill_300e.txt) \| [base](https://raw.githubusercontent.com/suous/RecNeXt/main/logs/normal/recnext_m4_without_distill_300e.txt) |
114
+ | M5 | 83.3* \| 82.9 | 22.9M | 4.7 | 3.4ms | 1487ms | 104 | [dist](https://github.com/suous/RecNeXt/releases/download/v1.0/recnext_m5_distill_300e_fused.pt) \| [base](https://github.com/suous/RecNeXt/releases/download/v1.0/recnext_m5_without_distill_300e_fused.pt) | [dist](https://raw.githubusercontent.com/suous/RecNeXt/main/logs/distill/recnext_m5_distill_300e.txt) \| [base](https://raw.githubusercontent.com/suous/RecNeXt/main/logs/normal/recnext_m5_without_distill_300e.txt) |
115
+ | A0 | 75.0* \| 73.6 | 2.8M | 0.4 | 1.4ms | 177ms | 4891 | [dist](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_a0_distill_300e_fused.pt) \| [base](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_a0_without_distill_300e_fused.pt) | [dist](https://raw.githubusercontent.com/suous/RecNeXt/main/logs/distill/recnext_a0_distill_300e.txt) \| [base](https://raw.githubusercontent.com/suous/RecNeXt/main/logs/normal/recnext_a0_without_distill_300e.txt) |
116
+ | A1 | 79.6* \| 78.3 | 5.9M | 0.9 | 1.9ms | 334ms | 2730 | [dist](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_a1_distill_300e_fused.pt) \| [base](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_a1_without_distill_300e_fused.pt) | [dist](https://raw.githubusercontent.com/suous/RecNeXt/main/logs/distill/recnext_a1_distill_300e.txt) \| [base](https://raw.githubusercontent.com/suous/RecNeXt/main/logs/normal/recnext_a1_without_distill_300e.txt) |
117
+ | A2 | 80.8* \| 79.6 | 7.9M | 1.2 | 2.2ms | 413ms | 2331 | [dist](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_a2_distill_300e_fused.pt) \| [base](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_a2_without_distill_300e_fused.pt) | [dist](https://raw.githubusercontent.com/suous/RecNeXt/main/logs/distill/recnext_a2_distill_300e.txt) \| [base](https://raw.githubusercontent.com/suous/RecNeXt/main/logs/normal/recnext_a2_without_distill_300e.txt) |
118
+ | A3 | 81.1* \| 80.1 | 9.0M | 1.4 | 2.4ms | 447ms | 2151 | [dist](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_a3_distill_300e_fused.pt) \| [base](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_a3_without_distill_300e_fused.pt) | [dist](https://raw.githubusercontent.com/suous/RecNeXt/main/logs/distill/recnext_a3_distill_300e.txt) \| [base](https://raw.githubusercontent.com/suous/RecNeXt/main/logs/normal/recnext_a3_without_distill_300e.txt) |
119
+ | A4 | 82.5* \| 81.6 | 15.8M | 2.4 | 3.6ms | 764ms | 1265 | [dist](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_a4_distill_300e_fused.pt) \| [base](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_a4_without_distill_300e_fused.pt) | [dist](https://raw.githubusercontent.com/suous/RecNeXt/main/logs/distill/recnext_a4_distill_300e.txt) \| [base](https://raw.githubusercontent.com/suous/RecNeXt/main/logs/normal/recnext_a4_without_distill_300e.txt) |
120
+ | A5 | 83.5* \| 83.1 | 25.7M | 4.7 | 5.6ms | 1376ms | 733 | [dist](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_a5_distill_300e_fused.pt) \| [base](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_a5_without_distill_300e_fused.pt) | [dist](https://raw.githubusercontent.com/suous/RecNeXt/main/logs/distill/recnext_a5_distill_300e.txt) \| [base](https://raw.githubusercontent.com/suous/RecNeXt/main/logs/normal/recnext_a5_without_distill_300e.txt) |
121
 
122
  ### Comparison with [LSNet](https://github.com/jameslahm/lsnet)
123
 
124
+ We present a simple architecture whose overall design follows [LSNet](https://github.com/jameslahm/lsnet). The framework centers on sharing channel features from previous layers.
125
+ Our motivation for doing so is to reduce the computational cost of token mixers and minimize feature redundancy in the final stage.
126
+
127
+ ![Architecture](https://raw.githubusercontent.com/suous/RecNeXt/refs/heads/main/lsnet/figures/architecture.png)
128
+
129
+ #### With **Shared-Channel Blocks**
130
+
131
+ | model | top_1_accuracy | params | gmacs | npu_latency | cpu_latency | throughput | fused_weights | training_logs |
132
+ |-------|----------------|--------|-------|-------------|-------------|------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
133
+ | T | 76.8 \| 75.2 | 12.1M | 0.3 | 1.8ms | 105ms | 13957 | [dist](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_t_share_channel_distill_300e_fused.pt) \| [norm](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_t_share_channel_without_distill_300e_fused.pt) | [dist](https://raw.githubusercontent.com/suous/RecNeXt/main/lsnet/logs/distill/recnext_t_share_channel_distill_300e.txt) \| [norm](https://raw.githubusercontent.com/suous/RecNeXt/main/lsnet/logs/normal/recnext_t_share_channel_without_distill_300e.txt) |
134
+ | S | 79.5 \| 78.3 | 15.8M | 0.7 | 2.0ms | 182ms | 8034 | [dist](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_s_share_channel_distill_300e_fused.pt) \| [norm](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_s_share_channel_without_distill_300e_fused.pt) | [dist](https://raw.githubusercontent.com/suous/RecNeXt/main/lsnet/logs/distill/recnext_s_share_channel_distill_300e.txt) \| [norm](https://raw.githubusercontent.com/suous/RecNeXt/main/lsnet/logs/normal/recnext_s_share_channel_without_distill_300e.txt) |
135
+ | B | 81.5 \| 80.3 | 19.2M | 1.1 | 2.5ms | 296ms | 4472 | [dist](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_b_share_channel_distill_300e_fused.pt) \| [norm](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_b_share_channel_without_distill_300e_fused.pt) | [dist](https://raw.githubusercontent.com/suous/RecNeXt/main/lsnet/logs/distill/recnext_b_share_channel_distill_300e.txt) \| [norm](https://raw.githubusercontent.com/suous/RecNeXt/main/lsnet/logs/normal/recnext_b_share_channel_without_distill_300e.txt) |
136
+
137
+ #### Without **Shared-Channel Blocks**
138
+
139
+ | model | top_1_accuracy | params | gmacs | npu_latency | cpu_latency | throughput | fused_weights | training_logs |
140
+ |-------|----------------|--------|-------|-------------|-------------|------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
141
+ | T | 76.6* \| 75.1 | 12.1M | 0.3 | 1.8ms | 109ms | 13878 | [dist](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_t_distill_300e_fused.pt) \| [base](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_t_without_distill_300e_fused.pt) | [dist](https://raw.githubusercontent.com/suous/RecNeXt/main/lsnet/logs/distill/recnext_t_distill_300e.txt) \| [base](https://raw.githubusercontent.com/suous/RecNeXt/main/lsnet/logs/normal/recnext_t_without_distill_300e.txt) |
142
+ | S | 79.6* \| 78.3 | 15.8M | 0.7 | 2.0ms | 188ms | 7989 | [dist](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_s_distill_300e_fused.pt) \| [base](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_s_without_distill_300e_fused.pt) | [dist](https://raw.githubusercontent.com/suous/RecNeXt/main/lsnet/logs/distill/recnext_s_distill_300e.txt) \| [base](https://raw.githubusercontent.com/suous/RecNeXt/main/lsnet/logs/normal/recnext_s_without_distill_300e.txt) |
143
+ | B | 81.4* \| 80.3 | 19.3M | 1.1 | 2.5ms | 290ms | 4450 | [dist](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_b_distill_300e_fused.pt) \| [base](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_b_without_distill_300e_fused.pt) | [dist](https://raw.githubusercontent.com/suous/RecNeXt/main/lsnet/logs/distill/recnext_b_distill_300e.txt) \| [base](https://raw.githubusercontent.com/suous/RecNeXt/main/lsnet/logs/normal/recnext_b_without_distill_300e.txt) |
144
 
145
  > The NPU latency is measured on an iPhone 13 with models compiled by Core ML Tools.
146
  > The CPU latency is measured on a quad-core ARM Cortex-A57 processor with models in ONNX format.
147
  > The throughput is tested on an Nvidia RTX 3090 with the maximum power-of-two batch size that fits in memory.
148
 
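The "maximum power-of-two batch size that fits in memory" used for the throughput numbers can be found by doubling the batch size until a probe fails. The following is a minimal sketch of that search, not the repository's benchmarking script: `max_power_of_two_batch` and the `fits` probe are hypothetical names, and the sketch assumes that if a batch size fails, every larger one fails too.

```python
from typing import Callable


def max_power_of_two_batch(fits: Callable[[int], bool], limit: int = 2 ** 16) -> int:
    """Return the largest power-of-two batch size b <= limit for which fits(b) is True.

    `fits` is a hypothetical probe supplied by the caller. In practice it would
    run one forward pass at the given batch size and return False when the
    device runs out of memory. Returns 0 if even a batch size of 1 does not fit.
    """
    best = 0
    batch = 1
    # Keep doubling while the probe succeeds; stop at the first failure.
    while batch <= limit and fits(batch):
        best = batch
        batch *= 2
    return best


# Toy probe standing in for a real GPU check: pretend up to 3000 images fit.
print(max_power_of_two_batch(lambda b: b <= 3000))  # prints 2048
```

Throughput is then the number of images processed per second at the returned batch size.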
149
 
150
+ ## Latency Measurement
151
 
152
  The latency reported in RecNeXt for iPhone 13 (iOS 18) uses the benchmark tool from [Xcode 14](https://developer.apple.com/videos/play/wwdc2022/10027/).
153
 
 
235
  <img src="https://raw.githubusercontent.com/suous/RecNeXt/main/figures/latency/recnext_a5_224x224.png" alt="recnext_a5">
236
  </details>
237
 
238
+ <details>
239
+ <summary>
240
+ RecNeXt-T
241
+ </summary>
242
+ <img src="https://raw.githubusercontent.com/suous/RecNeXt/main/lsnet/figures/latency/recnext_t_224x224.png" alt="recnext_t">
243
+ </details>
244
+
245
+ <details>
246
+ <summary>
247
+ RecNeXt-S
248
+ </summary>
249
+ <img src="https://raw.githubusercontent.com/suous/RecNeXt/main/lsnet/figures/latency/recnext_s_224x224.png" alt="recnext_s">
250
+ </details>
251
+
252
+ <details>
253
+ <summary>
254
+ RecNeXt-B
255
+ </summary>
256
+ <img src="https://raw.githubusercontent.com/suous/RecNeXt/main/lsnet/figures/latency/recnext_b_224x224.png" alt="recnext_b">
257
+ </details>
258
+
259
  Tips: export the model to a Core ML model
260
  ```
261
  python export_coreml.py --model recnext_m1 --ckpt pretrain/recnext_m1_distill_300e.pth
 
268
  ## ImageNet (Training and Evaluation)
269
 
270
  ### Prerequisites
271
+ A `conda` virtual environment is recommended.
272
  ```
273
  conda create -n recnext python=3.8
274
  pip install -r requirements.txt
 
307
  ```
308
  python -m torch.distributed.launch --nproc_per_node=8 --master_port 12346 --use_env main.py --model recnext_m1 --data-path ~/imagenet --dist-eval
309
  ```
310
+ Tips: specify your data path and model name!
311
 
312
+ ### Testing
313
  For example, to test RecNeXt-M1:
314
  ```
315
  python main.py --eval --model recnext_m1 --resume pretrain/recnext_m1_distill_300e.pth --data-path ~/imagenet
 
345
  ```
346
 
347
  ## Downstream Tasks
348
+ [Object Detection and Instance Segmentation](https://github.com/suous/RecNeXt/blob/main/detection/README.md)<br>
349
+
350
+ | model | $AP^b$ | $AP_{50}^b$ | $AP_{75}^b$ | $AP^m$ | $AP_{50}^m$ | $AP_{75}^m$ | Latency | Ckpt | Log |
351
+ |:------|:------:|:-----------:|:-----------:|:------:|:-----------:|:-----------:|:-------:|:---------------------------------------------------------------------------------:|:----------------------------------------------------------------------------------------------:|
352
+ | M3 | 41.7 | 63.4 | 45.4 | 38.6 | 60.5 | 41.4 | 5.2ms | [M3](https://github.com/suous/RecNeXt/releases/download/v1.0/recnext_m3_coco.pth) | [M3](https://raw.githubusercontent.com/suous/RecNeXt/main/detection/logs/recnext_m3_coco.json) |
353
+ | M4 | 43.5 | 64.9 | 47.7 | 39.7 | 62.1 | 42.4 | 7.6ms | [M4](https://github.com/suous/RecNeXt/releases/download/v1.0/recnext_m4_coco.pth) | [M4](https://raw.githubusercontent.com/suous/RecNeXt/main/detection/logs/recnext_m4_coco.json) |
354
+ | M5 | 44.6 | 66.3 | 49.0 | 40.6 | 63.5 | 43.5 | 12.4ms | [M5](https://github.com/suous/RecNeXt/releases/download/v1.0/recnext_m5_coco.pth) | [M5](https://raw.githubusercontent.com/suous/RecNeXt/main/detection/logs/recnext_m5_coco.json) |
355
+ | A3 | 42.1 | 64.1 | 46.2 | 38.8 | 61.1 | 41.6 | 8.3ms | [A3](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_a3_coco.pth) | [A3](https://raw.githubusercontent.com/suous/RecNeXt/main/detection/logs/recnext_a3_coco.json) |
356
+ | A4 | 43.5 | 65.4 | 47.6 | 39.8 | 62.4 | 42.9 | 14.0ms | [A4](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_a4_coco.pth) | [A4](https://raw.githubusercontent.com/suous/RecNeXt/main/detection/logs/recnext_a4_coco.json) |
357
+ | A5 | 44.4 | 66.3 | 48.9 | 40.3 | 63.3 | 43.4 | 25.3ms | [A5](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_a5_coco.pth) | [A5](https://raw.githubusercontent.com/suous/RecNeXt/main/detection/logs/recnext_a5_coco.json) |
358
+
359
+ [Semantic Segmentation](https://github.com/suous/RecNeXt/blob/main/segmentation/README.md)
360
+
361
+ | Model | mIoU | Latency | Ckpt | Log |
362
+ |:-----------|:----:|:-------:|:-----------------------------------------------------------------------------------:|:---------------------------------------------------------------------------------------------------:|
363
+ | RecNeXt-M3 | 41.0 | 5.6ms | [M3](https://github.com/suous/RecNeXt/releases/download/v1.0/recnext_m3_ade20k.pth) | [M3](https://raw.githubusercontent.com/suous/RecNeXt/main/segmentation/logs/recnext_m3_ade20k.json) |
364
+ | RecNeXt-M4 | 43.6 | 7.2ms | [M4](https://github.com/suous/RecNeXt/releases/download/v1.0/recnext_m4_ade20k.pth) | [M4](https://raw.githubusercontent.com/suous/RecNeXt/main/segmentation/logs/recnext_m4_ade20k.json) |
365
+ | RecNeXt-M5 | 46.0 | 12.4ms | [M5](https://github.com/suous/RecNeXt/releases/download/v1.0/recnext_m5_ade20k.pth) | [M5](https://raw.githubusercontent.com/suous/RecNeXt/main/segmentation/logs/recnext_m5_ade20k.json) |
366
+ | RecNeXt-A3 | 41.9 | 8.4ms | [A3](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_a3_ade20k.pth) | [A3](https://raw.githubusercontent.com/suous/RecNeXt/main/segmentation/logs/recnext_a3_ade20k.json) |
367
+ | RecNeXt-A4 | 43.0 | 14.0ms | [A4](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_a4_ade20k.pth) | [A4](https://raw.githubusercontent.com/suous/RecNeXt/main/segmentation/logs/recnext_a4_ade20k.json) |
368
+ | RecNeXt-A5 | 46.5 | 25.3ms | [A5](https://github.com/suous/RecNeXt/releases/download/v2.0/recnext_a5_ade20k.pth) | [A5](https://raw.githubusercontent.com/suous/RecNeXt/main/segmentation/logs/recnext_a5_ade20k.json) |
369
 
370
  ## Ablation Study
371
 
 
403
  ├── <a style="text-decoration:none" href="https://raw.githubusercontent.com/suous/RecNeXt/main/logs/ablation/384/recnext_m1_120e_384x384_rec_convtrans_7x7_group_7791.txt">recnext_m1_120e_384x384_rec_convtrans_7x7_group_7791.txt</a>
404
  └── <a style="text-decoration:none" href="https://raw.githubusercontent.com/suous/RecNeXt/main/logs/ablation/384/recnext_m1_120e_384x384_rec_convtrans_7x7_split_7683.txt">recnext_m1_120e_384x384_rec_convtrans_7x7_split_7683.txt</a>
405
  </pre>
406
  </details>
407
 
408
  <details>
 
425
  'bias': bias
426
  }
427
  self.n = nn.Conv2d(stride=2, **kwargs)
428
+ self.a = nn.Conv2d(**kwargs) if level > 1 else None
429
+ self.b = nn.Conv2d(**kwargs)
430
  self.c = nn.Conv2d(**kwargs)
431
  self.d = nn.Conv2d(**kwargs)
432
 
 
550
 
551
  ### RecConv Beyond
552
 
553
+ We apply RecConv to [MLLA](https://github.com/LeapLabTHU/MLLA) small variants, replacing linear attention and downsampling layers.
554
  This results in higher throughput and lower training memory usage.
555
 
556
+ <details>
557
+ <summary>
558
+ <span style="font-size: larger; ">Ablation Logs</span>
559
+ </summary>
560
+
561
  <pre>
562
  mlla/logs
563
  ├── 1_mlla_nano
 
573
  ├── <a style="text-decoration:none" href="https://raw.githubusercontent.com/suous/RecNeXt/main/mlla/logs/2_mlla_mini/04_recattn_nearest_interp.txt">04_recattn_nearest_interp.txt</a>
574
  └── <a style="text-decoration:none" href="https://raw.githubusercontent.com/suous/RecNeXt/main/mlla/logs/2_mlla_mini/05_recattn_nearest_interp_simplify.txt">05_recattn_nearest_interp_simplify.txt</a>
575
  </pre>
576
  </details>
577
 
578
  ## Limitations
 
583
 
584
  ## Acknowledgement
585
 
586
+ The classification (ImageNet) code base is partly built with [LeViT](https://github.com/facebookresearch/LeViT), [PoolFormer](https://github.com/sail-sg/poolformer), [EfficientFormer](https://github.com/snap-research/EfficientFormer), [RepViT](https://github.com/THU-MIG/RepViT), [LSNet](https://github.com/jameslahm/lsnet), [MLLA](https://github.com/LeapLabTHU/MLLA), and [MogaNet](https://github.com/Westlake-AI/MogaNet).
587
 
588
+ The detection and segmentation pipeline is from [MMCV](https://github.com/open-mmlab/mmcv) ([MMDetection](https://github.com/open-mmlab/mmdetection) and [MMSegmentation](https://github.com/open-mmlab/mmsegmentation)).
589
 
590
+ Thanks for the great implementations!
591
 
592
  ## Citation
593
 
 
 
594
  ```BibTeX
595
  @misc{zhao2024recnext,
596
  title={RecConv: Efficient Recursive Convolutions for Multi-Frequency Representations},