---
library_name: pytorch
tags:
- computer-vision
- pose-estimation
- panoramic-images
- covispose
- pytorch-model
license: mit
base_model: resnet50
---

# CovisPose Model

This model estimates relative poses between panoramic images using the CovisPose framework.

## Model Details

- **Architecture**: CovisPose with a resnet50 backbone
- **Transformer Layers**: 6
- **FFN Dimension**: 2048
- **Input Size**: [512, 1024] (height, width)
- **Parameters**: 121,890,467 (estimated)
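
As a rough guide to the checkpoint's footprint, the parameter count above can be converted to bytes. The sketch below assumes the weights are stored as 32-bit floats (4 bytes each), which this card does not state; `estimate_weight_bytes` is a hypothetical helper and not part of the CovisPose code.

```python
PARAM_COUNT = 121_890_467  # parameter count listed in the model details


def estimate_weight_bytes(num_params: int, bytes_per_param: int = 4) -> int:
    """Return the raw size of the weights, assuming a fixed width per parameter."""
    return num_params * bytes_per_param


# Assumption: float32 storage (4 bytes per parameter)
size_bytes = estimate_weight_bytes(PARAM_COUNT)
size_mib = size_bytes / 2**20
print(f"{size_bytes:,} bytes (about {size_mib:.0f} MiB) at float32")
```

A half-precision copy would need roughly half of that; pass `bytes_per_param=2` to estimate it.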

## Training Information

### Configuration
- **Epochs**: N/A
- **Batch Size**: N/A
- **Learning Rate**: N/A
- **Backbone**: N/A

### Performance Metrics
- **Final Training Loss**: N/A
- **Training Rotation Error**: N/A
- **Final Validation Loss**: N/A
- **Validation Rotation Error**: N/A
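
This card does not define how the rotation error is computed. One common choice for panoramas that differ by a rotation about the vertical axis is the absolute angular difference, wrapped so that 350° and 10° count as 20° apart. The sketch below assumes rotations are expressed as yaw angles in degrees; both function names are hypothetical and not taken from the training code.

```python
def yaw_error_deg(pred_deg: float, true_deg: float) -> float:
    """Absolute angular difference in degrees, wrapped into [0, 180]."""
    diff = (pred_deg - true_deg) % 360.0  # Python's % yields a value in [0, 360)
    return min(diff, 360.0 - diff)


def mean_yaw_error_deg(preds, targets) -> float:
    """Average wrapped error over paired predictions and ground-truth angles."""
    errors = [yaw_error_deg(p, t) for p, t in zip(preds, targets)]
    return sum(errors) / len(errors)


# Predictions on either side of the 0/360 seam are still only 20 degrees apart
seam_error = yaw_error_deg(350.0, 10.0)
```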

## Usage

```python
import json

import torch
from huggingface_hub import hf_hub_download

# The COVIS class is not bundled with this checkpoint. It comes from the
# training code, which must be importable (for example, on PYTHONPATH).
from models.covispose_model import COVIS

# Download the model weights and configuration
model_path = hf_hub_download(
    repo_id="SGEthan/covis_toy",
    filename="pytorch_model.bin",
)
config_path = hf_hub_download(
    repo_id="SGEthan/covis_toy",
    filename="config.json",
)

# Load the configuration
with open(config_path, "r") as f:
    config = json.load(f)

# Build the model from the stored hyperparameters
model = COVIS(
    backbone=config["backbone"],
    num_transformer_layers=config["num_transformer_layers"],
    transformer_ffn_dim=config["transformer_ffn_dim"],
)

# Load the weights. The file may be a full training checkpoint
# (weights under "model_state_dict") or a bare state dict.
checkpoint = torch.load(model_path, map_location="cpu")
if "model_state_dict" in checkpoint:
    model.load_state_dict(checkpoint["model_state_dict"])
else:
    model.load_state_dict(checkpoint)

model.eval()

# Run inference
with torch.no_grad():
    # Your inference code here, for example:
    # outputs1, outputs2 = model(pano1_tensor, pano2_tensor)
    pass
```
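
The loading code reads three keys from `config.json`, so it can help to check that they are present before building the model. The following is a minimal sketch: the example values mirror the figures listed under Model Details, but the exact contents of the published `config.json` are an assumption, and `check_config` and the extra `input_size` key are hypothetical.

```python
REQUIRED_KEYS = ("backbone", "num_transformer_layers", "transformer_ffn_dim")


def check_config(config: dict) -> dict:
    """Return the keyword arguments used to build the model, or raise if any is missing."""
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise KeyError(f"config.json is missing keys: {missing}")
    return {key: config[key] for key in REQUIRED_KEYS}


# Illustrative values only; the real file may hold different or additional keys
example_config = {
    "backbone": "resnet50",
    "num_transformer_layers": 6,
    "transformer_ffn_dim": 2048,
    "input_size": [512, 1024],  # hypothetical extra key, dropped by check_config
}
model_kwargs = check_config(example_config)
```

The returned dictionary could then be passed on as `COVIS(**model_kwargs)`, provided the constructor accepts exactly these keyword arguments.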

## Model Architecture

The CovisPose model consists of:

1. **Backbone Network**: resnet50 for feature extraction
2. **Transformer Encoder**: 6 layers for processing image features
3. **Prediction Heads**:
   - Covisibility mask prediction
   - Relative pose estimation
   - Boundary detection

## Task Description

**CovisPose** estimates the relative pose between two panoramic images by:

1. **Covisibility Estimation**: Predicting which parts of the images overlap
2. **Pose Regression**: Estimating relative rotation and translation
3. **Boundary Detection**: Finding floor-wall boundaries for scale estimation
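
To illustrate why floor-wall boundaries carry scale information: in an equirectangular panorama each image row corresponds to a fixed elevation angle, so a boundary pixel below the horizon fixes the horizontal distance to the wall once the camera height is known. The sketch below assumes an equirectangular projection with the horizon at mid-height, a level camera, and a known camera height. These assumptions and the helper names are illustrative and not taken from the CovisPose code.

```python
import math


def row_to_depression_rad(row: float, image_height: int) -> float:
    """Angle below the horizon for a row of an equirectangular panorama.

    Row 0 maps to 90 degrees above the horizon, row image_height to 90 degrees below.
    """
    return (row / image_height - 0.5) * math.pi


def floor_distance_m(row: float, image_height: int, camera_height_m: float) -> float:
    """Horizontal distance to the floor point seen at the given row."""
    angle = row_to_depression_rad(row, image_height)
    if angle <= 0:
        raise ValueError("row must lie below the horizon to hit the floor")
    return camera_height_m / math.tan(angle)


# A boundary at row 384 of a 512-row panorama lies 45 degrees below the horizon,
# so the wall base is as far away as the camera is high.
distance = floor_distance_m(384, 512, camera_height_m=1.5)
```

Every distance scales linearly with the assumed camera height, which is why the boundary fixes the layout only up to that one unknown.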

## Training Data

This model was trained on panoramic image pairs with:
- Relative pose annotations
- Covisibility masks
- Floor-wall boundary labels

## Limitations

- Designed specifically for indoor panoramic images
- Requires significant visual overlap between image pairs for reliable pose estimation
- Performance may degrade on outdoor scenes or images with minimal overlap

## Citation

If you use this model, please cite the CovisPose work:

```bibtex
@article{covispose2024,
  title={CovisPose: Co-visibility Pose Estimation for Panoramic Images},
  author={Your Authors},
  journal={Conference/Journal},
  year={2024}
}
```

## License

This model is released under the MIT License.

## Repository

- **Training Code**: Available in the original repository
- **Model Upload**: Generated automatically from local checkpoint

---

*Model uploaded on 2025-06-27T09:05:36.254713 using upload_model.py*