---
license: cc-by-4.0
library_name: pytorch
tags:
- computer-vision
- object-tracking
- spiking-neural-networks
- visual-streaming-perception
- energy-efficient
- cvpr-2025
pipeline_tag: object-detection
---
# ViStream: Law-of-Charge-Conservation Inspired Spiking Neural Network for Visual Streaming Perception
**ViStream** is an energy-efficient framework for Visual Streaming Perception (VSP) built on Spiking Neural Networks (SNNs) whose neurons satisfy a Law-of-Charge-Conservation (LoCC) property.
## Model Details
### Model Description
- **Developed by:** Kang You, Ziling Wei, Jing Yan, Boning Zhang, Qinghai Guo, Yaoyu Zhang, Zhezhi He
- **Model type:** Spiking Neural Network for Visual Streaming Perception
- **Framework:** PyTorch
- **License:** CC-BY-4.0
- **Paper:** [CVPR 2025](https://openaccess.thecvf.com/content/CVPR2025/papers/You_VISTREAM_Improving_Computation_Efficiency_of_Visual_Streaming_Perception_via_Law-of-Charge-Conservation_CVPR_2025_paper.pdf)
- **Repository:** [GitHub](https://github.com/Intelligent-Computing-Research-Group/ViStream)
### Model Architecture
ViStream introduces two key innovations:
1. **Law of Charge Conservation (LoCC)** property in ST-BIF neurons
2. **Differential Encoding (DiffEncode)** scheme for temporal optimization
The framework achieves a significant reduction in computation while matching the accuracy of its ANN counterpart.
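The exact neuron and encoder definitions live in the GitHub repository; the sketch below only illustrates the two ideas conceptually, assuming a bipolar integrate-and-fire update and frame-difference encoding. The names `STBIFNeuron` and `diff_encode`, and the specific reset/refund rule, are illustrative assumptions, not the repository's API.

```python
import torch

class STBIFNeuron(torch.nn.Module):
    """Illustrative bipolar integrate-and-fire neuron.

    The neuron integrates incoming charge and emits +1/-1 spikes whenever the
    membrane crosses the threshold, keeping a running spike count so that
    (threshold * cumulative spikes) tracks the total input charge, which is
    the charge-conservation intuition behind LoCC.
    """

    def __init__(self, threshold: float = 1.0):
        super().__init__()
        self.threshold = threshold
        self.reset_state()

    def reset_state(self):
        self.membrane = None    # accumulated, not-yet-emitted charge
        self.spike_sum = None   # cumulative emitted spikes

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if self.membrane is None:
            self.membrane = torch.zeros_like(x)
            self.spike_sum = torch.zeros_like(x)
        self.membrane = self.membrane + x
        # Positive spike when enough charge has accumulated; negative spike
        # "refunds" charge that was emitted too eagerly on earlier steps.
        pos = (self.membrane >= self.threshold).float()
        neg = ((self.membrane <= -self.threshold) & (self.spike_sum > 0)).float()
        spikes = pos - neg
        self.membrane = self.membrane - spikes * self.threshold
        self.spike_sum = self.spike_sum + spikes
        return spikes

def diff_encode(frames: torch.Tensor) -> torch.Tensor:
    """Illustrative differential encoding: pass the first frame as-is, then
    only frame-to-frame differences, so a stateful SNN sees mostly near-zero
    input on temporally redundant video."""
    diffs = frames.clone()
    diffs[1:] = frames[1:] - frames[:-1]
    return diffs
```

In a streaming setting the neuron state persists across frames, so the near-zero inputs produced by differential encoding translate directly into fewer spikes and fewer accumulate operations.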
## Uses
### Direct Use
ViStream can be directly used for:
- **Multiple Object Tracking (MOT)**
- **Single Object Tracking (SOT)**
- **Video Object Segmentation (VOS)**
- **Multiple Object Tracking and Segmentation (MOTS)**
- **Pose Tracking**
### Downstream Use
The model can be fine-tuned for various visual streaming perception tasks in:
- Autonomous driving
- UAV navigation
- AR/VR applications
- Real-time surveillance
## Bias, Risks, and Limitations
### Limitations
- Requires specific hardware optimization for maximum energy benefits
- Performance may vary with different frame rates
- Limited to visual perception tasks
### Recommendations
- Test thoroughly on target hardware before deployment
- Consider computational constraints of edge devices
- Validate performance on domain-specific datasets
## How to Get Started with the Model
```python
from huggingface_hub import hf_hub_download
import torch
# Download the checkpoint
checkpoint_path = hf_hub_download(
    repo_id="AndyBlocker/ViStream",
    filename="checkpoint-90.pth",
)
# Load the model (requires ViStream implementation)
checkpoint = torch.load(checkpoint_path, map_location='cpu')
```
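The checkpoint layout is not documented on this page, so it can help to inspect the downloaded file before wiring it into the repository's model code. The `"model"` nesting key below is an assumption; check the printed keys on your own copy.

```python
# Assumption: the checkpoint is a dict that may nest its weights under a
# "model" key; verify against your downloaded copy before loading.
state_dict = checkpoint.get("model", checkpoint)
print("top-level keys:", list(checkpoint.keys())[:10])
print("number of weight tensors:", len(state_dict))
# model.load_state_dict(state_dict)  # `model` must be built from the ViStream code on GitHub
```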
For complete usage examples, see the [GitHub repository](https://github.com/Intelligent-Computing-Research-Group/ViStream).
## Training Details
### Training Data
The model was trained on multiple datasets for various visual streaming perception tasks including object tracking, video object segmentation, and pose tracking.
### Training Procedure
- **Framework:** PyTorch
- **Optimization:** energy-efficient SNN training that exploits the Law of Charge Conservation
- **Architecture:** ResNet-based backbone with spike quantization layers (see the conversion sketch below)
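A common way to obtain such a backbone is to take a standard ResNet and swap its ReLU activations for spiking/quantization units during ANN-to-SNN conversion. The helper below is an illustrative sketch of that swap, not the repository's conversion code; it reuses the hypothetical `STBIFNeuron` from the architecture sketch above.

```python
import torch.nn as nn
from torchvision.models import resnet50

def convert_relu_to_spiking(module: nn.Module, neuron_factory) -> nn.Module:
    """Recursively replace every nn.ReLU with a spiking/quantization unit.

    `neuron_factory()` must return a module with the same shape contract as
    ReLU, e.g. the illustrative STBIFNeuron sketched earlier on this card."""
    for name, child in module.named_children():
        if isinstance(child, nn.ReLU):
            setattr(module, name, neuron_factory())
        else:
            convert_relu_to_spiking(child, neuron_factory)
    return module

# Hypothetical usage: build a plain ResNet-50 and spike-convert it.
backbone = convert_relu_to_spiking(resnet50(weights=None), lambda: STBIFNeuron(threshold=1.0))
```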
## Evaluation
The model demonstrates competitive performance across multiple visual streaming perception tasks while achieving significant energy efficiency improvements compared to traditional ANN-based approaches. Detailed evaluation results are available in the [CVPR 2025 paper](https://openaccess.thecvf.com/content/CVPR2025/papers/You_VISTREAM_Improving_Computation_Efficiency_of_Visual_Streaming_Perception_via_Law-of-Charge-Conservation_CVPR_2025_paper.pdf).
## Model Card Authors
Kang You, Ziling Wei, Jing Yan, Boning Zhang, Qinghai Guo, Yaoyu Zhang, Zhezhi He
## Model Card Contact
For questions about this model, please open an issue in the [GitHub repository](https://github.com/Intelligent-Computing-Research-Group/ViStream).
## Citation
```bibtex
@inproceedings{you2025vistream,
title={VISTREAM: Improving Computation Efficiency of Visual Streaming Perception via Law-of-Charge-Conservation Inspired Spiking Neural Network},
author={You, Kang and Wei, Ziling and Yan, Jing and Zhang, Boning and Guo, Qinghai and Zhang, Yaoyu and He, Zhezhi},
booktitle={Proceedings of the Computer Vision and Pattern Recognition Conference},
pages={8796--8805},
year={2025}
}
``` |