---
library_name: pytorch
tags:
- computer-vision
- pose-estimation
- panoramic-images
- covispose
- pytorch-model
license: mit
base_model: resnet50
---
# CovisPose Model
This model estimates relative poses between panoramic images using the CovisPose framework.
## Model Details
- **Architecture**: CovisPose with resnet50 backbone
- **Transformer Layers**: 6
- **FFN Dimension**: 2048
- **Input Size**: [512, 1024]
- **Parameters**: 121,890,467 (estimated)
## Training Information
### Configuration
- **Epochs**: N/A
- **Batch Size**: N/A
- **Learning Rate**: N/A
- **Backbone**: N/A
### Performance Metrics
- **Final Training Loss**: N/A
- **Training Rotation Error**: N/A
- **Final Validation Loss**: N/A
- **Validation Rotation Error**: N/A
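The card lists a rotation error metric but does not say how it is computed. The following is a minimal sketch of one common choice, assuming the rotation is a single yaw angle in degrees. The function names `yaw_error_deg` and `mean_rotation_error` are hypothetical and not part of the CovisPose code.

```python
def yaw_error_deg(pred_deg: float, gt_deg: float) -> float:
    """Absolute angular difference in degrees, wrapped into [0, 180]."""
    # Shift by 180, wrap into [0, 360), shift back: result lies in [-180, 180).
    diff = (pred_deg - gt_deg + 180.0) % 360.0 - 180.0
    return abs(diff)


def mean_rotation_error(preds, gts):
    """Mean wrapped yaw error over paired predictions and ground truths."""
    if len(preds) != len(gts) or not preds:
        raise ValueError("preds and gts must be non-empty and the same length")
    return sum(yaw_error_deg(p, g) for p, g in zip(preds, gts)) / len(preds)
```

Wrapping matters because 350° and 10° are only 20° apart, not 340°.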
## Usage
```python
import torch
import json
from huggingface_hub import hf_hub_download

# Download model files
model_path = hf_hub_download(
    repo_id="SGEthan/covis_toy",
    filename="pytorch_model.bin"
)
config_path = hf_hub_download(
    repo_id="SGEthan/covis_toy",
    filename="config.json"
)

# Load configuration
with open(config_path, 'r') as f:
    config = json.load(f)

# Initialize model (you'll need the COVIS class)
from models.covispose_model import COVIS
model = COVIS(
    backbone=config['backbone'],
    num_transformer_layers=config['num_transformer_layers'],
    transformer_ffn_dim=config['transformer_ffn_dim']
)

# Load weights
checkpoint = torch.load(model_path, map_location='cpu')
if "model_state_dict" in checkpoint:
    model.load_state_dict(checkpoint["model_state_dict"])
else:
    model.load_state_dict(checkpoint)
model.eval()

# Use for inference
with torch.no_grad():
    # Your inference code here
    # outputs1, outputs2 = model(pano1_tensor, pano2_tensor)
    pass
```
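The two branching steps in the usage example (reading constructor arguments from `config.json` and unwrapping the checkpoint) can be factored into small helpers that are easy to test without downloading anything. This is only a sketch: `covis_kwargs` and `extract_state_dict` are hypothetical names, and the required keys are the three that the example above reads from the config.

```python
REQUIRED_KEYS = ("backbone", "num_transformer_layers", "transformer_ffn_dim")


def covis_kwargs(config: dict) -> dict:
    """Pick the constructor arguments used above, failing early if one is missing."""
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise KeyError(f"config.json is missing keys: {missing}")
    return {key: config[key] for key in REQUIRED_KEYS}


def extract_state_dict(checkpoint: dict) -> dict:
    """Return the weights whether or not they are wrapped in a training checkpoint."""
    # Training checkpoints store weights under "model_state_dict";
    # plain exports are already the state dict itself.
    return checkpoint.get("model_state_dict", checkpoint)
```

With these, the loading step would read `model = COVIS(**covis_kwargs(config))` followed by `model.load_state_dict(extract_state_dict(checkpoint))`.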
## Model Architecture
The CovisPose model consists of:
1. **Backbone Network**: resnet50 for feature extraction
2. **Transformer Encoder**: 6 layers for processing image features
3. **Prediction Heads**:
   - Covisibility mask prediction
   - Relative pose estimation
   - Boundary detection
## Task Description
**CovisPose** estimates the relative pose between two panoramic images by:
1. **Covisibility Estimation**: Predicting which parts of the images overlap
2. **Pose Regression**: Estimating relative rotation and translation
3. **Boundary Detection**: Finding floor-wall boundaries for scale estimation
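To illustrate why a floor-wall boundary carries scale information, here is a sketch of the standard geometric relation for an equirectangular panorama. It assumes a level camera, a flat floor, a known camera height, and pixel-center row coordinates. The function names and the 1.6 m height in the usage note are hypothetical, and this is not necessarily how CovisPose itself recovers scale.

```python
import math


def row_to_latitude(row: float, height: int = 512) -> float:
    """Latitude in radians of a pixel row in an equirectangular image.

    Row 0 is the top of the image; positive latitude is above the horizon.
    """
    return (0.5 - (row + 0.5) / height) * math.pi


def floor_distance(row: float, camera_height_m: float, height: int = 512) -> float:
    """Horizontal distance to a floor point seen at the given row."""
    latitude = row_to_latitude(row, height)
    if latitude >= 0:
        raise ValueError("a floor-wall boundary must lie below the horizon")
    # The ray hits the floor camera_height_m below the camera, at an
    # angle of -latitude below the horizon.
    return camera_height_m / math.tan(-latitude)
```

For example, a boundary seen 45° below the horizon lies at a horizontal distance equal to the camera height, so `floor_distance(383.5, 1.6)` is approximately 1.6 m for a 512-row image.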
## Training Data
This model was trained on panoramic image pairs with:
- Relative pose annotations
- Covisibility masks
- Floor-wall boundary labels
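The card does not describe the annotation format. As an illustration of what a relative pose annotation can look like, the sketch below derives one from two absolute camera poses on a floor plan. It assumes planar poses `(x, y, yaw)` with yaw in radians and counter-clockwise positive; `relative_pose_2d` is a hypothetical helper, not part of the training code.

```python
import math


def relative_pose_2d(pose1, pose2):
    """Pose of camera 2 expressed in the frame of camera 1.

    Each pose is (x, y, yaw) with yaw in radians, counter-clockwise positive.
    """
    x1, y1, yaw1 = pose1
    x2, y2, yaw2 = pose2
    dx, dy = x2 - x1, y2 - y1
    cos1, sin1 = math.cos(yaw1), math.sin(yaw1)
    # Rotate the world-frame offset by -yaw1 to express it in camera 1's frame.
    rel_x = cos1 * dx + sin1 * dy
    rel_y = -sin1 * dx + cos1 * dy
    # Wrap the yaw difference into [-pi, pi).
    rel_yaw = (yaw2 - yaw1 + math.pi) % (2 * math.pi) - math.pi
    return rel_x, rel_y, rel_yaw
```

If camera 1 faces along the world +y axis and camera 2 sits two units further along +y, the relative translation comes out as two units straight ahead in camera 1's frame.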
## Limitations
- Designed specifically for indoor panoramic images
- Requires significant visual overlap between image pairs for reliable pose estimation
- Performance may degrade on outdoor scenes or images with minimal overlap
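Because pose quality depends on overlap, one option is to gate predictions on the predicted covisibility before trusting the pose. The sketch below assumes the covisibility output can be reduced to a flat sequence of scores in [0, 1] (for example one per image column). That format, the function names, and both thresholds are assumptions rather than documented behaviour of this model.

```python
def covisible_fraction(scores, threshold: float = 0.5) -> float:
    """Fraction of entries whose covisibility score reaches the threshold."""
    if not scores:
        raise ValueError("scores must be non-empty")
    return sum(1 for s in scores if s >= threshold) / len(scores)


def has_enough_overlap(scores, min_fraction: float = 0.1, threshold: float = 0.5) -> bool:
    """Decide whether a pair overlaps enough to trust its predicted pose."""
    return covisible_fraction(scores, threshold) >= min_fraction
```

Pairs that fail the check can be skipped or flagged for manual review instead of being passed downstream with an unreliable pose.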
## Citation
If you use this model, please cite the CovisPose work:
```bibtex
@article{covispose2024,
  title={CovisPose: Co-visibility Pose Estimation for Panoramic Images},
  author={Your Authors},
  journal={Conference/Journal},
  year={2024}
}
```
## License
This model is released under the MIT License.
## Repository
- **Training Code**: Available in the original repository
- **Model Upload**: Generated automatically from local checkpoint
---
*Model uploaded on 2025-06-27T09:05:36.254713 using upload_model.py*