---
library_name: pytorch
tags:
- computer-vision
- pose-estimation
- panoramic-images
- covispose
- pytorch-model
license: mit
base_model: resnet50
---

# CovisPose Model

This model estimates relative poses between panoramic images using the CovisPose framework.

## Model Details

- **Architecture**: CovisPose with resnet50 backbone
- **Transformer Layers**: 6
- **FFN Dimension**: 2048
- **Input Size**: [512, 1024]
- **Parameters**: 121,890,467 (estimated)

## Training Information

### Configuration

- **Epochs**: N/A
- **Batch Size**: N/A
- **Learning Rate**: N/A
- **Backbone**: N/A

### Performance Metrics

- **Final Training Loss**: N/A
- **Training Rotation Error**: N/A
- **Final Validation Loss**: N/A
- **Validation Rotation Error**: N/A

## Usage

```python
import torch
import json
from huggingface_hub import hf_hub_download

# Download model files
model_path = hf_hub_download(
    repo_id="SGEthan/covis_toy",
    filename="pytorch_model.bin"
)
config_path = hf_hub_download(
    repo_id="SGEthan/covis_toy",
    filename="config.json"
)

# Load configuration
with open(config_path, 'r') as f:
    config = json.load(f)

# Initialize model (you'll need the COVIS class)
from models.covispose_model import COVIS

model = COVIS(
    backbone=config['backbone'],
    num_transformer_layers=config['num_transformer_layers'],
    transformer_ffn_dim=config['transformer_ffn_dim']
)

# Load weights
checkpoint = torch.load(model_path, map_location='cpu')
if "model_state_dict" in checkpoint:
    model.load_state_dict(checkpoint["model_state_dict"])
else:
    model.load_state_dict(checkpoint)

model.eval()

# Use for inference
with torch.no_grad():
    # Your inference code here
    # outputs1, outputs2 = model(pano1_tensor, pano2_tensor)
    pass
```

## Model Architecture

The CovisPose model consists of:

1. **Backbone Network**: resnet50 for feature extraction
2. **Transformer Encoder**: 6 layers for processing image features
3. **Prediction Heads**:
   - Covisibility mask prediction
   - Relative pose estimation
   - Boundary detection

## Task Description

**CovisPose** estimates the relative pose between two panoramic images by:

1. **Covisibility Estimation**: Predicting which parts of the images overlap
2. **Pose Regression**: Estimating relative rotation and translation
3. **Boundary Detection**: Finding floor-wall boundaries for scale estimation

## Training Data

This model was trained on panoramic image pairs with:

- Relative pose annotations
- Covisibility masks
- Floor-wall boundary labels

## Limitations

- Designed specifically for indoor panoramic images
- Requires significant visual overlap between image pairs for reliable pose estimation
- Performance may degrade on outdoor scenes or images with minimal overlap

## Citation

If you use this model, please cite the CovisPose work:

```bibtex
@article{covispose2024,
  title={CovisPose: Co-visibility Pose Estimation for Panoramic Images},
  author={Your Authors},
  journal={Conference/Journal},
  year={2024}
}
```

## License

This model is released under the MIT License.

## Repository

- **Training Code**: Available in the original repository
- **Model Upload**: Generated automatically from local checkpoint

---

*Model uploaded on 2025-06-27T09:05:36.254713 using upload_model.py*
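
## Appendix: Rotation Error Sketch

The Performance Metrics section lists a rotation error, but this card does not define how it is computed. The snippet below is a minimal sketch of one common choice, the geodesic angle between a predicted and a ground-truth rotation matrix. The helper names `yaw_matrix` and `rotation_error_deg` are hypothetical, and treating the y-axis as the vertical axis is an assumption; neither is taken from the CovisPose code.

```python
import math


def yaw_matrix(deg):
    """3x3 rotation about the vertical axis (assumed here to be y)."""
    a = math.radians(deg)
    c, s = math.cos(a), math.sin(a)
    return [[c, 0.0, s],
            [0.0, 1.0, 0.0],
            [-s, 0.0, c]]


def rotation_error_deg(r_pred, r_true):
    """Geodesic angle in degrees between two 3x3 rotation matrices."""
    # trace(r_pred^T @ r_true) equals the sum of elementwise products.
    trace = sum(r_pred[i][j] * r_true[i][j]
                for i in range(3) for j in range(3))
    # Clamp so rounding error cannot push acos outside its domain.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))


# Two yaw rotations that differ by 30 degrees.
print(rotation_error_deg(yaw_matrix(10.0), yaw_matrix(40.0)))
```

The example prints a value close to 30, the yaw difference between the two matrices. If the model outputs rotation as a single yaw angle rather than a matrix, the same error reduces to the wrapped absolute angle difference.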