---
license: cc-by-4.0
datasets:
- LEI-QI-233/MicroG-4M
- LEI-QI-233/MicroG-HAR-train-ready
metrics:
- mAP
- F1-score
- Recall
- AUROC
pipeline_tag: video-classification
---

# Fine-tuned model weights for the MicroG-4M dataset

## Please see our paper, GitHub repository, and dataset first:


<div align="left">
   <a href="https://arxiv.org/abs/2506.02845"
     style="display: inline-block; margin: 0 4px;">
    <img src="https://img.shields.io/badge/arXiv-Paper-b31b1b?logo=arxiv" alt="arXiv Paper"/>
  </a>
    <a href="https://github.com/LEI-QI-233/HAR-in-Space"
     style="display: inline-block; margin: 0 4px;">
    <img src="https://img.shields.io/badge/GitHub-GitHub%20Repo-white?logo=github"
         alt="GitHub"/>
  </a>
  <a href="https://huggingface.co/datasets/LEI-QI-233/MicroG-4M"
     style="display: inline-block; margin: 0 4px;">
    <img src="https://img.shields.io/badge/Hugging%20Face-Dataset-orange?logo=huggingface"
         alt="Hugging Face Dataset"/>
  </a>
</div>

---

### Performance comparison of models fine-tuned on MicroG-4M for HAR

| Arch     | TC   | Backbone | #Params (M) | mAP (%) | F1-score (%) | Recall (%) | AUROC (%) |
| -------- | ---- | -------- | ----------- | ------- | ------------ | ---------- | --------- |
| C2D      | 8×8  | R50      | 23.61       | 29.51   | 8.09         | 6.58       | 83.49     |
| C2D NLN  | 8×8  | R50      | 30.97       | 44.64   | 28.30        | 24.86      | 89.40     |
| I3D      | 8×8  | R50      | 27.33       | 46.41   | 26.37        | 22.25      | 88.79     |
| I3D NLN  | 8×8  | R50      | 34.68       | 47.12   | 28.07        | 24.65      | 88.52     |
| Slow     | 8×8  | R50      | 31.74       | 45.19   | 26.13        | 22.77      | 88.49     |
| Slow     | 4×16 | R50      | 31.74       | 46.37   | 28.72        | 25.38      | 88.30     |
| SlowFast | 8×8  | R50      | 33.76       | 43.02   | 22.63        | 18.98      | 88.51     |
| SlowFast | 4×16 | R50      | 33.76       | 42.10   | 23.69        | 20.18      | 87.54     |
| MViTv1   | 16×4 | B-CONV   | 36.34       | 12.86   | 5.54         | 4.66       | 74.63     |
| MViTv2   | 16×4 | S        | 34.27       | 15.14   | 8.16         | 7.17       | 78.61     |
| X3D      | 13×6 | S        | 2.02        | 14.07   | 5.77         | 4.52       | 78.23     |
| X3D      | 16×5 | L        | 4.37        | 18.70   | 9.15         | 7.47       | 78.27     |


**Note:** 
- All models were pretrained on the Kinetics-400 dataset and then fine-tuned on MicroG-4M.
- `TC` denotes the temporal configuration (frame length × sampling rate). 
- `#Params` indicates the number of parameters (in millions, M).
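
To make the `TC` column concrete, here is a minimal sketch of how a temporal configuration maps to frame indices. It assumes the common convention that `T×τ` means `T` frames are taken, one every `τ` raw frames, starting from a clip start index. The helper names `parse_tc` and `clip_frame_indices` and the `start` parameter are hypothetical and are not taken from our training code.

```python
def parse_tc(tc: str) -> tuple[int, int]:
    """Split a TC string such as "8×8" into (num_frames, sampling_rate)."""
    frames, rate = tc.split("×")
    return int(frames), int(rate)


def clip_frame_indices(num_frames: int, sampling_rate: int, start: int = 0) -> list[int]:
    """Return raw-video frame indices for one clip (assumed convention:
    take `num_frames` frames, one every `sampling_rate` frames)."""
    return [start + i * sampling_rate for i in range(num_frames)]


# Temporal configurations that appear in the table above.
for tc in ["8×8", "4×16", "16×4", "13×6", "16×5"]:
    t, tau = parse_tc(tc)
    indices = clip_frame_indices(t, tau)
    # Under this convention, t * tau is the number of raw frames a clip spans.
    print(f"{tc}: first indices {indices[:3]}, span {t * tau} raw frames")
```

Under this assumption, the `8×8`, `4×16`, and `16×4` configurations all span 64 raw frames and differ only in how densely that window is sampled.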