StreamFormer committed on
Commit 6972901 · verified · 1 Parent(s): 2188a2d

Update README.md

Files changed (1):
  1. README.md +49 -3
README.md CHANGED
@@ -1,3 +1,49 @@
- ---
- license: cc-by-4.0
- ---
+ ---
+ license: "cc-by-nc-4.0"
+ tags:
+ - vision
+ - video-classification
+ ---
+
+ # StreamFormer (base-sized model)
+
+ StreamFormer backbone model pre-trained at the *global*, *temporal* and *spatial* granularities. It was introduced in the paper [Learning Streaming Video Representation via Multitask Training](https://arxiv.org/abs/2504.20041) and first released in [this repository](https://github.com/Go2Heart/StreamFormer).
+
+ ## Intended uses & limitations
+
+ StreamFormer is a streaming video representation backbone that encodes a stream of video input. It is designed for downstream applications such as Online Action Detection, Online Video Instance Segmentation, and Video Question Answering.
+
+ ### How to use
+
+ How to extract the multi-granularity features:
+
+ ```python
+ # Run this from the root of the StreamFormer repository so that `models` is importable.
+ import torch
+ from models import TimesformerMultiTaskingModelSigLIP
+
+ model = TimesformerMultiTaskingModelSigLIP.from_pretrained("StreamFormer/streamformer-timesformer").eval()
+ with torch.no_grad():
+     fake_frames = torch.randn(1, 16, 3, 224, 224)  # dummy clip of shape [B, T, C, H, W]
+     fake_frames = fake_frames.to(model.device)
+     output = model(fake_frames)
+
+     # global representation [B, D]
+     print(output.pooler_output[:, -1].shape, output.pooler_output[:, -1])
+
+     # temporal representation [B, T, D]
+     print(output.pooler_output.shape, output.pooler_output)
+
+     # spatial representation [B, T, HxW, D]
+     print(output.last_hidden_state.shape, output.last_hidden_state)
+ ```
+
+ ### BibTeX entry and citation info
+
+ ```bibtex
+ @misc{yan2025learning,
+       title={Learning Streaming Video Representation via Multitask Training},
+       author={Yibin Yan and Jilan Xu and Shangzhe Di and Yikun Liu and Yudi Shi and Qirui Chen and Zeqian Li and Yifei Huang and Weidi Xie},
+       year={2025},
+       eprint={2504.20041},
+       archivePrefix={arXiv},
+       primaryClass={cs.CV}
+ }
+ ```
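
The snippet in the README above runs the backbone once on a fixed 16-frame clip of random data. For a streaming setting, one simple (unofficial) pattern is to keep a sliding window over incoming frames and reuse the same forward call; the sketch below is illustrative only and assumes frames are already preprocessed to 3×224×224, with the 16-frame window length taken from the snippet above.

```python
import torch
from collections import deque

from models import TimesformerMultiTaskingModelSigLIP  # provided by the StreamFormer repository

model = TimesformerMultiTaskingModelSigLIP.from_pretrained("StreamFormer/streamformer-timesformer").eval()

NUM_FRAMES = 16                    # clip length used in the README snippet
window = deque(maxlen=NUM_FRAMES)  # rolling buffer holding the most recent frames

with torch.no_grad():
    for t in range(64):
        # Stand-in for one incoming, already-preprocessed frame [C, H, W];
        # a real pipeline would decode, resize, and normalize actual video here.
        frame = torch.randn(3, 224, 224)
        window.append(frame)
        if len(window) < NUM_FRAMES:
            continue  # wait until the buffer holds a full clip

        clip = torch.stack(list(window)).unsqueeze(0).to(model.device)  # [1, T, C, H, W]
        output = model(clip)
        current_global = output.pooler_output[:, -1]  # global summary of the stream so far, [1, D]
        per_frame = output.pooler_output              # temporal features for the window, [1, T, D]
```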
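The intended-uses section lists Online Action Detection among the downstream tasks. Below is a minimal sketch of attaching a per-frame classifier to the temporal representation ([B, T, D]); the linear head and the number of classes are hypothetical illustrations and not part of the release, which provides the backbone only.

```python
import torch
import torch.nn as nn

from models import TimesformerMultiTaskingModelSigLIP  # provided by the StreamFormer repository

backbone = TimesformerMultiTaskingModelSigLIP.from_pretrained("StreamFormer/streamformer-timesformer").eval()

NUM_CLASSES = 21  # hypothetical number of action classes for a downstream dataset

with torch.no_grad():
    frames = torch.randn(1, 16, 3, 224, 224).to(backbone.device)  # dummy clip [B, T, C, H, W]
    temporal = backbone(frames).pooler_output                     # temporal representation [B, T, D]

# Hypothetical per-frame head; in practice it would be trained on the downstream task.
head = nn.Linear(temporal.shape[-1], NUM_CLASSES)
logits = head(temporal)             # [B, T, NUM_CLASSES] -- one prediction per frame
probs = logits.softmax(dim=-1)
print(probs.shape)
```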