Upload CovisPose model from local checkpoint
- README.md +142 -0
- config.json +15 -0
- pytorch_model.bin +3 -0
- requirements.txt +5 -0
- upload_metadata.json +48 -0
README.md
ADDED
@@ -0,0 +1,142 @@
---
library_name: pytorch
tags:
- computer-vision
- pose-estimation
- panoramic-images
- covispose
- pytorch-model
license: mit
base_model: resnet50
---

# CovisPose Model

This model estimates relative poses between panoramic images using the CovisPose framework.

## Model Details

- **Architecture**: CovisPose with resnet50 backbone
- **Transformer Layers**: 6
- **FFN Dimension**: 2048
- **Input Size**: [512, 1024] (height × width; see the preprocessing sketch below)
- **Parameters**: 121,890,467 (estimated)
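
A minimal preprocessing sketch for producing an input of the documented shape. The resize target follows the `[512, 1024]` input size above; the ImageNet normalization constants are an assumption, since this upload does not state the preprocessing used in training:

```python
from PIL import Image
from torchvision import transforms

# Sketch: resize an equirectangular panorama to the documented
# 512x1024 (height x width) input and add a batch dimension.
# The normalization values are the standard ImageNet statistics,
# an assumption rather than a documented requirement.
preprocess = transforms.Compose([
    transforms.Resize((512, 1024)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

pano = Image.open("pano1.jpg").convert("RGB")
pano_tensor = preprocess(pano).unsqueeze(0)  # shape: [1, 3, 512, 1024]
```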

## Training Information

The checkpoint carried no training arguments or metrics (see `upload_metadata.json`), so the fields below are unavailable.

### Configuration
- **Epochs**: N/A
- **Batch Size**: N/A
- **Learning Rate**: N/A
- **Backbone**: N/A

### Performance Metrics
- **Final Training Loss**: N/A
- **Training Rotation Error**: N/A
- **Final Validation Loss**: N/A
- **Validation Rotation Error**: N/A

## Usage

```python
import torch
import json
from huggingface_hub import hf_hub_download

# The COVIS class comes from the original training repository
from models.covispose_model import COVIS

# Download model files
model_path = hf_hub_download(
    repo_id="SGEthan/covis_toy",
    filename="pytorch_model.bin"
)
config_path = hf_hub_download(
    repo_id="SGEthan/covis_toy",
    filename="config.json"
)

# Load configuration
with open(config_path, 'r') as f:
    config = json.load(f)

# Initialize model
model = COVIS(
    backbone=config['backbone'],
    num_transformer_layers=config['num_transformer_layers'],
    transformer_ffn_dim=config['transformer_ffn_dim']
)

# Load weights (this checkpoint wraps them in "model_state_dict")
checkpoint = torch.load(model_path, map_location='cpu')
if "model_state_dict" in checkpoint:
    model.load_state_dict(checkpoint["model_state_dict"])
else:
    model.load_state_dict(checkpoint)

model.eval()

# Use for inference
with torch.no_grad():
    # Your inference code here, e.g.:
    # outputs1, outputs2 = model(pano1_tensor, pano2_tensor)
    pass
```
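
The inference call above is left as a placeholder. Continuing from that snippet, here is a minimal sanity-check sketch; it assumes the two-panorama forward signature hinted at in the comment (`model(pano1, pano2)` returning one output per image), which is not a confirmed API:

```python
# Sketch only: forward signature and output structure are assumptions
# based on the commented call above.
pano1 = torch.randn(1, 3, 512, 1024)  # [batch, channels, height, width]
pano2 = torch.randn(1, 3, 512, 1024)

with torch.no_grad():
    outputs1, outputs2 = model(pano1, pano2)
```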

## Model Architecture

The CovisPose model consists of:

1. **Backbone Network**: resnet50 for feature extraction
2. **Transformer Encoder**: 6 layers for processing image features
3. **Prediction Heads** (a sketch of how these compose follows the list):
   - Covisibility mask prediction
   - Relative pose estimation
   - Boundary detection
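
As a rough illustration of how these components might fit together — the class name, head names, `d_model`, and `nhead` below are hypothetical placeholders, not the checkpoint's actual layer names (see `upload_metadata.json` for those):

```python
import torch.nn as nn
from torchvision.models import resnet50

class CovisPoseSketch(nn.Module):
    """Hypothetical composition of backbone, encoder, and heads."""

    def __init__(self, num_layers=6, d_model=256, ffn_dim=2048):
        super().__init__()
        # 1. Backbone: resnet50 trunk for panorama feature extraction
        self.backbone = resnet50()
        # 2. Transformer encoder: 6 layers over the extracted features
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=8, dim_feedforward=ffn_dim, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        # 3. Prediction heads (output dimensions are placeholders)
        self.covis_head = nn.Linear(d_model, 1)     # covisibility logits
        self.pose_head = nn.Linear(d_model, 4)      # rotation + translation
        self.boundary_head = nn.Linear(d_model, 1)  # floor-wall boundary
```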

## Task Description

**CovisPose** estimates the relative pose between two panoramic images by:

1. **Covisibility Estimation**: Predicting which parts of the images overlap
2. **Pose Regression**: Estimating the relative rotation and translation (illustrated below)
3. **Boundary Detection**: Finding floor-wall boundaries for scale estimation
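
To make the pose output concrete, here is a sketch of how a decoded relative pose could be applied. One common parameterization for upright indoor panoramas is a yaw angle plus a planar translation; whether this model uses exactly that parameterization is an assumption, not something this upload confirms:

```python
import numpy as np

# Illustration only: the yaw-plus-planar-translation parameterization
# is an assumption about the pose output, not a confirmed format.
def relative_pose_to_matrix(yaw_rad: float, tx: float, ty: float) -> np.ndarray:
    """Build a 3x3 planar rigid transform: rotation about the vertical
    axis followed by a translation in the floor plane."""
    c, s = np.cos(yaw_rad), np.sin(yaw_rad)
    return np.array([
        [c, -s, tx],
        [s,  c, ty],
        [0.0, 0.0, 1.0],
    ])

# Map a point expressed in panorama 2's frame into panorama 1's frame
T_12 = relative_pose_to_matrix(np.pi / 6, 1.5, 0.0)
point_in_2 = np.array([2.0, 0.5, 1.0])  # homogeneous planar point
point_in_1 = T_12 @ point_in_2
```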

## Training Data

This model was trained on panoramic image pairs with:
- Relative pose annotations
- Covisibility masks
- Floor-wall boundary labels

## Limitations

- Designed specifically for indoor panoramic images
- Requires significant visual overlap between image pairs for reliable pose estimation
- Performance may degrade on outdoor scenes or images with minimal overlap

## Citation

If you use this model, please cite the CovisPose work:

```bibtex
@article{covispose2024,
  title={CovisPose: Co-visibility Pose Estimation for Panoramic Images},
  author={Your Authors},
  journal={Conference/Journal},
  year={2024}
}
```

## License

This model is released under the MIT License.

## Repository

- **Training Code**: Available in the original repository
- **Model Upload**: Generated automatically from local checkpoint

---

*Model uploaded on 2025-06-27T09:05:36.254713 using upload_model.py*
config.json
ADDED
@@ -0,0 +1,15 @@
{
  "architecture": "CovisPose",
  "framework": "pytorch",
  "model_type": "pose_estimation",
  "task": "panoramic_pose_estimation",
  "input_size": [
    512,
    1024
  ],
  "backbone": "resnet50",
  "num_transformer_layers": 6,
  "transformer_ffn_dim": 2048,
  "num_classes": null,
  "library_name": "pytorch"
}
pytorch_model.bin
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3b5ed1a2abbd8701fc9fab0d5d6dfbb9e2835f60ea17c25778b2137566e4b62b
size 1460570852
requirements.txt
ADDED
@@ -0,0 +1,5 @@
torch>=1.9.0
torchvision>=0.10.0
numpy>=1.19.0
pillow>=8.0.0
huggingface_hub>=0.12.0
upload_metadata.json
ADDED
@@ -0,0 +1,48 @@
{
  "original_checkpoint": "outputs/hf_training/checkpoints/final_model.pth",
  "upload_date": "2025-06-27T09:05:36.254713",
  "analysis": {
    "checkpoint_keys": [
      "epoch",
      "model_state_dict",
      "optimizer_state_dict",
      "loss"
    ],
    "has_model_state_dict": true,
    "has_optimizer_state_dict": true,
    "has_training_metrics": false,
    "has_val_metrics": false,
    "has_args": false,
    "training_info": {},
    "model_info": {
      "num_parameters": 121890467,
      "parameter_keys": [
        "covis_feature_extractor.feature_extractor.encoder.conv1.1.weight",
        "covis_feature_extractor.feature_extractor.encoder.bn1.weight",
        "covis_feature_extractor.feature_extractor.encoder.bn1.bias",
        "covis_feature_extractor.feature_extractor.encoder.bn1.running_mean",
        "covis_feature_extractor.feature_extractor.encoder.bn1.running_var",
        "covis_feature_extractor.feature_extractor.encoder.bn1.num_batches_tracked",
        "covis_feature_extractor.feature_extractor.encoder.layer1.0.conv1.weight",
        "covis_feature_extractor.feature_extractor.encoder.layer1.0.bn1.weight",
        "covis_feature_extractor.feature_extractor.encoder.layer1.0.bn1.bias",
        "covis_feature_extractor.feature_extractor.encoder.layer1.0.bn1.running_mean"
      ]
    }
  },
  "config": {
    "architecture": "CovisPose",
    "framework": "pytorch",
    "model_type": "pose_estimation",
    "task": "panoramic_pose_estimation",
    "input_size": [
      512,
      1024
    ],
    "backbone": "resnet50",
    "num_transformer_layers": 6,
    "transformer_ffn_dim": 2048,
    "num_classes": null,
    "library_name": "pytorch"
  }
}