---
library_name: pytorch
tags:
- computer-vision
- pose-estimation
- panoramic-images
- covispose
- pytorch-model
license: mit
base_model: resnet50
---
# CovisPose Model
This model estimates relative poses between panoramic images using the CovisPose framework.
## Model Details
- **Architecture**: CovisPose with resnet50 backbone
- **Transformer Layers**: 6
- **FFN Dimension**: 2048
- **Input Size**: [512, 1024]
- **Parameters**: 121,890,467 (estimated)
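
After the model is constructed and loaded (see Usage below), the reported parameter count can be checked directly:

```python
# `model` as constructed in the Usage section below
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params:,}")  # expected: 121,890,467
```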
## Training Information
### Configuration
Training hyperparameters (epochs, batch size, learning rate) were not recorded with this checkpoint; the backbone is the resnet50 listed under Model Details.
### Performance Metrics
Final training and validation losses and rotation errors were likewise not recorded with this checkpoint.
## Usage
```python
import json

import torch
from huggingface_hub import hf_hub_download

# The COVIS class ships with the training code, not with this checkpoint
from models.covispose_model import COVIS

# Download the checkpoint and its configuration
model_path = hf_hub_download(repo_id="SGEthan/covis_toy", filename="pytorch_model.bin")
config_path = hf_hub_download(repo_id="SGEthan/covis_toy", filename="config.json")

# Load the configuration
with open(config_path, "r") as f:
    config = json.load(f)

# Initialize the model from the stored configuration
model = COVIS(
    backbone=config["backbone"],
    num_transformer_layers=config["num_transformer_layers"],
    transformer_ffn_dim=config["transformer_ffn_dim"],
)

# Load the weights; the checkpoint may or may not wrap the state dict
checkpoint = torch.load(model_path, map_location="cpu")
if "model_state_dict" in checkpoint:
    model.load_state_dict(checkpoint["model_state_dict"])
else:
    model.load_state_dict(checkpoint)
model.eval()

# Inference takes two panorama tensors (see the preprocessing sketch below)
with torch.no_grad():
    # outputs1, outputs2 = model(pano1_tensor, pano2_tensor)
    pass
```
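
The inference call above needs two panorama tensors at the input size listed under Model Details. A minimal preprocessing sketch, assuming RGB input and ImageNet normalization (the repository's own transforms may differ):

```python
import torch
from PIL import Image
from torchvision import transforms

preprocess = transforms.Compose([
    transforms.Resize((512, 1024)),   # input size from config
    transforms.ToTensor(),
    transforms.Normalize(             # ImageNet statistics (assumption)
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225],
    ),
])

# Build (1, 3, 512, 1024) tensors for the two panoramas
pano1_tensor = preprocess(Image.open("pano1.jpg").convert("RGB")).unsqueeze(0)
pano2_tensor = preprocess(Image.open("pano2.jpg").convert("RGB")).unsqueeze(0)

with torch.no_grad():
    outputs1, outputs2 = model(pano1_tensor, pano2_tensor)  # `model` from the snippet above
```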
## Model Architecture
The CovisPose model consists of three components (a code sketch follows the list):
1. **Backbone Network**: resnet50 for feature extraction
2. **Transformer Encoder**: 6 layers for processing image features
3. **Prediction Heads**:
- Covisibility mask prediction
- Relative pose estimation
- Boundary detection
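
To make this structure concrete, here is a minimal PyTorch sketch of such a model. It is an illustration under assumptions (token dimension 256, 8 attention heads, and the head parameterizations), not the actual COVIS implementation in `models/covispose_model.py`:

```python
import torch
import torch.nn as nn
import torchvision


class CovisPoseSketch(nn.Module):
    """Hypothetical sketch of the architecture described above; the real
    COVIS class in models/covispose_model.py may differ in detail."""

    def __init__(self, num_transformer_layers=6, ffn_dim=2048, d_model=256):
        super().__init__()
        # 1. Backbone: ResNet-50 truncated before pooling/classification
        resnet = torchvision.models.resnet50(weights=None)
        self.backbone = nn.Sequential(*list(resnet.children())[:-2])
        self.proj = nn.Conv2d(2048, d_model, kernel_size=1)
        # 2. Transformer encoder over flattened feature tokens
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=8, dim_feedforward=ffn_dim, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_transformer_layers)
        # 3. Prediction heads (output parameterizations are assumptions)
        self.covis_head = nn.Linear(d_model, 1)     # per-token covisibility
        self.pose_head = nn.Linear(d_model, 4)      # e.g. yaw (cos, sin) + translation
        self.boundary_head = nn.Linear(d_model, 1)  # floor-wall boundary position

    def forward(self, pano):                        # pano: (B, 3, 512, 1024)
        feats = self.proj(self.backbone(pano))      # (B, d_model, 16, 32)
        tokens = feats.flatten(2).transpose(1, 2)   # (B, 512, d_model)
        tokens = self.encoder(tokens)
        covis = self.covis_head(tokens).squeeze(-1)        # (B, 512)
        pose = self.pose_head(tokens.mean(dim=1))          # (B, 4)
        boundary = self.boundary_head(tokens).squeeze(-1)  # (B, 512)
        return covis, pose, boundary
```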
## Task Description
**CovisPose** estimates the relative pose between two panoramic images by:
1. **Covisibility Estimation**: Predicting which parts of the images overlap
2. **Pose Regression**: Estimating the relative rotation and translation (see the composition sketch after this list)
3. **Boundary Detection**: Finding floor-wall boundaries for scale estimation
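
For indoor panoramas, a relative pose is often parameterized as a yaw rotation plus a translation in the floor plane. The sketch below shows how such an estimate could be composed into a rigid transform; this parameterization is an assumption for illustration, not the model's documented output format:

```python
import numpy as np

def relative_pose_matrix(yaw_rad: float, tx: float, ty: float) -> np.ndarray:
    """Hypothetical: compose a 2D rigid transform mapping points from
    pano 2's frame into pano 1's frame, given a yaw angle and a
    floor-plane translation (metres)."""
    c, s = np.cos(yaw_rad), np.sin(yaw_rad)
    return np.array([
        [c, -s, tx],
        [s,  c, ty],
        [0.0, 0.0, 1.0],
    ])

# Example: a 90-degree yaw with a 1 m offset along x
T_12 = relative_pose_matrix(np.pi / 2, 1.0, 0.0)
point_in_pano1 = T_12 @ np.array([0.0, 2.0, 1.0])  # homogeneous 2D point
```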
## Training Data
This model was trained on panoramic image pairs with the following annotations (an illustrative sample layout is sketched after the list):
- Relative pose annotations
- Covisibility masks
- Floor-wall boundary labels
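
For illustration, one training pair might bundle these annotations as follows. This schema is purely hypothetical; the actual format is defined by the training code:

```python
sample = {
    "pano1": "scene_0001/pano_00.jpg",
    "pano2": "scene_0001/pano_01.jpg",
    # Hypothetical pose fields: pose of pano 2 expressed in pano 1's frame
    "relative_pose": {"yaw_deg": 37.5, "tx_m": 1.2, "ty_m": -0.4},
    # Per-column covisibility: 1.0 where the column is also visible in the other pano
    "covis_mask1": [1.0, 1.0, 0.0],  # truncated for brevity
    "covis_mask2": [0.0, 1.0, 1.0],
    # Per-column floor-wall boundary, e.g. as normalized image rows
    "boundary1": [0.71, 0.70, 0.69],
    "boundary2": [0.68, 0.69, 0.70],
}
```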
## Limitations
- Designed specifically for indoor panoramic images
- Requires significant visual overlap between image pairs for reliable pose estimation
- Performance may degrade on outdoor scenes or images with minimal overlap
## Citation
If you use this model, please cite the CovisPose work:
```bibtex
@article{covispose2024,
  title={CovisPose: Co-visibility Pose Estimation for Panoramic Images},
  author={Your Authors},
  journal={Conference/Journal},
  year={2024}
}
```
## License
This model is released under the MIT License.
## Repository
- **Training Code**: Available in the original repository
- **Model Upload**: Generated automatically from a local checkpoint
---
*Model uploaded on 2025-06-27T09:05:36.254713 using upload_model.py*