cminst/StreamMamba

Video Classification


qingy2024 committed on Jul 21

Commit 6afa5b2 · verified · 1 Parent(s): 31cb5c5

Update README.md

Files changed (1)

README.md +2 -2


@@ -32,12 +32,12 @@ This model is licensed under the <a href="https://www.apache.org/licenses/LICENS
 | Filename                | Size     | Description                                                                 |
 |-------------------------|----------|-----------------------------------------------------------------------------|
 | `cross_mamba_film_warmup.pt` | 504 MB | Cross-modal model combining vision and text using **FiLM** (Feature-wise Linear Modulation) and **Mamba** layers for temporal modeling. |
-| `mamba_mobileclip_ckpt.pt`   | 500 MB | Mamba-based temporal aggregator trained on MobileCLIP embeddings (no FiLM). Checkpoint 6900. |
 | `internvideo2_clip.pt`       | 5.55 MB | CLIP-style vision-language alignment component for InternVideo2-B14. |
 | `internvideo2_vision.pt`     | 205 MB  | Vision encoder backbone (InternVideo2-B14) for video feature extraction. |
 | `mobileclip_blt.pt`          | 599 MB  | Lightweight **MobileCLIP** variant (BLT) for resource-constrained applications. |
-#### Self-Predictive Frame Skipping (SPFS)
 The `spfs_r64` folder contains a self-contained system for adaptive frame skipping in videos. Each checkpoint file includes:
 - MobileCLIP vision/text encoders
 - InternVideo2-B14 vision encoder weights

 | Filename                | Size     | Description                                                                 |
 |-------------------------|----------|-----------------------------------------------------------------------------|
 | `cross_mamba_film_warmup.pt` | 504 MB | Cross-modal model combining vision and text using **FiLM** (Feature-wise Linear Modulation) and **Mamba** layers for temporal modeling. |
+| `mamba_mobileclip_ckpt.pt`   | 500 MB | <span style="position: relative; cursor: help;"><span class="streammamba-glow">StreamMamba</span><span class="glow-ring"></span></span> temporal aggregator trained on MobileCLIP embeddings (no FiLM). Checkpoint 6900. |
 | `internvideo2_clip.pt`       | 5.55 MB | CLIP-style vision-language alignment component for InternVideo2-B14. |
 | `internvideo2_vision.pt`     | 205 MB  | Vision encoder backbone (InternVideo2-B14) for video feature extraction. |
 | `mobileclip_blt.pt`          | 599 MB  | Lightweight **MobileCLIP** variant (BLT) for resource-constrained applications. |
+#### <span style="position: relative; cursor: help;"><span class="streammamba-glow">StreamMamba</span><span class="glow-ring"></span></span> Self-Predictive Frame Skipping (SPFS)
 The `spfs_r64` folder contains a self-contained system for adaptive frame skipping in videos. Each checkpoint file includes:
 - MobileCLIP vision/text encoders
 - InternVideo2-B14 vision encoder weights