---
library_name: pytorch
tags:
- computer-vision
- pose-estimation
- panoramic-images
- covispose
- pytorch-model
license: mit
base_model: resnet50
---

# CovisPose Model

This model estimates relative poses between panoramic images using the CovisPose framework.

## Model Details

- **Architecture**: CovisPose with resnet50 backbone
- **Transformer Layers**: 6
- **FFN Dimension**: 2048
- **Input Size**: [512, 1024] (height × width)
- **Parameters**: 121,890,467 (estimated)
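
The parameter count above was estimated at upload time. Once the model is instantiated (see Usage below), it can be checked directly with a small helper like this:

```python
import torch.nn as nn

def count_parameters(model: nn.Module) -> tuple[int, int]:
    """Return (total, trainable) parameter counts for a module."""
    total = sum(p.numel() for p in model.parameters())
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    return total, trainable

# Example, with `model` constructed as in the Usage section:
# total, trainable = count_parameters(model)
# print(f"total: {total:,}  trainable: {trainable:,}")
```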

## Training Information

### Configuration
- **Epochs**: N/A
- **Batch Size**: N/A
- **Learning Rate**: N/A
- **Backbone**: resnet50

### Performance Metrics
- **Final Training Loss**: N/A
- **Training Rotation Error**: N/A
- **Final Validation Loss**: N/A
- **Validation Rotation Error**: N/A

## Usage

```python
import torch
import json
from huggingface_hub import hf_hub_download

# Download model files
model_path = hf_hub_download(
    repo_id="SGEthan/covis_toy",
    filename="pytorch_model.bin"
)

config_path = hf_hub_download(
    repo_id="SGEthan/covis_toy",
    filename="config.json"
)

# Load configuration
with open(config_path, 'r') as f:
    config = json.load(f)

# Initialize the model. The COVIS class is not bundled with this repo;
# it must be importable from the original CovisPose training codebase.
from models.covispose_model import COVIS

model = COVIS(
    backbone=config['backbone'],
    num_transformer_layers=config['num_transformer_layers'],
    transformer_ffn_dim=config['transformer_ffn_dim']
)

# Load weights
checkpoint = torch.load(model_path, map_location='cpu')
if "model_state_dict" in checkpoint:
    model.load_state_dict(checkpoint["model_state_dict"])
else:
    model.load_state_dict(checkpoint)

model.eval()

# Run inference on a pair of panorama tensors, each shaped
# [batch, 3, 512, 1024] to match the configured input size
with torch.no_grad():
    # outputs1, outputs2 = model(pano1_tensor, pano2_tensor)
    pass
```
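
The checkpoint does not ship preprocessing code, so the transform used in training is not recorded here. Below is a minimal sketch for preparing an equirectangular panorama as input, assuming a plain resize to the model's 512×1024 input and ImageNet normalization (both are assumptions, not confirmed by the checkpoint):

```python
import torch
import torchvision.transforms as T
from PIL import Image

# Assumed preprocessing: resize to the configured input size and
# normalize with ImageNet statistics. Verify against the original
# training pipeline before relying on the outputs.
preprocess = T.Compose([
    T.Resize((512, 1024)),  # (height, width) per the config input size
    T.ToTensor(),           # HWC uint8 -> CHW float in [0, 1]
    T.Normalize(mean=[0.485, 0.456, 0.406],
                std=[0.229, 0.224, 0.225]),
])

def load_pano(path: str) -> torch.Tensor:
    """Load one panorama and add a batch dim -> [1, 3, 512, 1024]."""
    return preprocess(Image.open(path).convert("RGB")).unsqueeze(0)

# pano1_tensor = load_pano("pano1.jpg")
# pano2_tensor = load_pano("pano2.jpg")
```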

## Model Architecture

The CovisPose model consists of:

1. **Backbone Network**: resnet50 for feature extraction
2. **Transformer Encoder**: 6 layers for processing image features
3. **Prediction Heads**:
   - Covisibility mask prediction
   - Relative pose estimation
   - Boundary detection
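
The exact layer layout lives in the original `models.covispose_model` source; the skeleton below is only a structural sketch of those three stages, with invented module and head names, to make the data flow concrete. It is not the actual COVIS implementation.

```python
import torch
import torch.nn as nn
import torchvision

class CovisPoseSketch(nn.Module):
    """Illustrative skeleton; layer names and shapes are assumptions."""

    def __init__(self, num_layers: int = 6, ffn_dim: int = 2048, d_model: int = 256):
        super().__init__()
        # 1. Backbone: resnet50 trunk for per-image feature extraction
        resnet = torchvision.models.resnet50()
        self.backbone = nn.Sequential(*list(resnet.children())[:-2])
        self.proj = nn.Conv2d(2048, d_model, kernel_size=1)
        # 2. Transformer encoder over the concatenated pair features
        layer = nn.TransformerEncoderLayer(
            d_model, nhead=8, dim_feedforward=ffn_dim, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        # 3. Prediction heads on the encoded tokens
        self.covis_head = nn.Linear(d_model, 1)     # covisibility mask
        self.pose_head = nn.Linear(d_model, 4)      # relative pose params
        self.boundary_head = nn.Linear(d_model, 1)  # floor-wall boundary

    def forward(self, pano1: torch.Tensor, pano2: torch.Tensor):
        feats = []
        for pano in (pano1, pano2):
            f = self.proj(self.backbone(pano))           # [B, d, H', W']
            feats.append(f.mean(dim=2).transpose(1, 2))  # column tokens [B, W', d]
        tokens = self.encoder(torch.cat(feats, dim=1))
        return (self.covis_head(tokens),
                self.pose_head(tokens),
                self.boundary_head(tokens))
```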

## Task Description

**CovisPose** estimates the relative pose between two panoramic images by:

1. **Covisibility Estimation**: Predicting which parts of the images overlap
2. **Pose Regression**: Estimating relative rotation and translation
3. **Boundary Detection**: Finding floor-wall boundaries for scale estimation
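
For indoor panoramas, the relative rotation usually reduces to a single yaw angle about the vertical axis, and such angles are commonly regressed as a (cos θ, sin θ) pair to avoid wrap-around discontinuities. A small illustration of recovering the angle under that convention (an assumption about the output format, not documented by this checkpoint):

```python
import math
import torch

def yaw_from_cos_sin(pred: torch.Tensor) -> float:
    """Recover a yaw angle in degrees from a predicted (cos, sin) pair.
    Assumes the network regresses rotation in this 2-vector form."""
    cos_t, sin_t = pred.flatten().tolist()
    return math.degrees(math.atan2(sin_t, cos_t))

print(yaw_from_cos_sin(torch.tensor([0.0, 1.0])))  # -> 90.0
```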

## Training Data

This model was trained on panoramic image pairs with:
- Relative pose annotations
- Covisibility masks
- Floor-wall boundary labels
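
The annotation format itself is not included with this upload. A hypothetical per-pair record covering the three label types above might look like the following (all field names are invented for illustration):

```python
# Hypothetical annotation record for one training pair (illustrative only)
pair_annotation = {
    "pano1": "scene_0001/pano_03.jpg",
    "pano2": "scene_0001/pano_07.jpg",
    "relative_pose": {"yaw_deg": 42.5, "translation": [1.3, 0.0]},
    "covis_mask1": "scene_0001/covis_03_07.png",  # per-column overlap mask
    "covis_mask2": "scene_0001/covis_07_03.png",
    "boundary1": "scene_0001/boundary_03.json",   # floor-wall boundary
    "boundary2": "scene_0001/boundary_07.json",
}
```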

## Limitations

- Designed specifically for indoor panoramic images
- Requires significant visual overlap between image pairs for reliable pose estimation
- Performance may degrade on outdoor scenes or images with minimal overlap

## Citation

If you use this model, please cite the CovisPose work:

```bibtex
@article{covispose2024,
  title={CovisPose: Co-visibility Pose Estimation for Panoramic Images},
  author={Your Authors},
  journal={Conference/Journal},
  year={2024}
}
```

## License

This model is released under the MIT License.

## Repository

- **Training Code**: Available in the original repository
- **Model Upload**: Generated automatically from a local checkpoint

---

*Model uploaded on 2025-06-27T09:05:36.254713 using upload_model.py*