SGEthan committed on
Commit 8d4f706 · verified · 1 Parent(s): 7fd90a7

Upload CovisPose model from local checkpoint

Files changed (5)
  1. README.md +142 -0
  2. config.json +15 -0
  3. pytorch_model.bin +3 -0
  4. requirements.txt +5 -0
  5. upload_metadata.json +48 -0
README.md ADDED
@@ -0,0 +1,142 @@
+ ---
+ library_name: pytorch
+ tags:
+ - computer-vision
+ - pose-estimation
+ - panoramic-images
+ - covispose
+ - pytorch-model
+ license: mit
+ base_model: resnet50
+ ---
+
+ # CovisPose Model
+
+ This model estimates relative poses between panoramic images using the CovisPose framework.
+
+ ## Model Details
+
+ - **Architecture**: CovisPose with resnet50 backbone
+ - **Transformer Layers**: 6
+ - **FFN Dimension**: 2048
+ - **Input Size**: [512, 1024]
+ - **Parameters**: 121,890,467 (estimated)
+
+ ## Training Information
+
+ Training hyperparameters and metrics were not stored in the uploaded checkpoint (see `upload_metadata.json`), so the fields below are unavailable.
+
+ ### Configuration
+ - **Epochs**: N/A
+ - **Batch Size**: N/A
+ - **Learning Rate**: N/A
+ - **Backbone**: N/A
+
+ ### Performance Metrics
+ - **Final Training Loss**: N/A
+ - **Training Rotation Error**: N/A
+ - **Final Validation Loss**: N/A
+ - **Validation Rotation Error**: N/A
+
+ ## Usage
+
+ ```python
+ import torch
+ import json
+ from huggingface_hub import hf_hub_download
+
+ # Download model files
+ model_path = hf_hub_download(
+     repo_id="SGEthan/covis_toy",
+     filename="pytorch_model.bin"
+ )
+
+ config_path = hf_hub_download(
+     repo_id="SGEthan/covis_toy",
+     filename="config.json"
+ )
+
+ # Load configuration
+ with open(config_path, 'r') as f:
+     config = json.load(f)
+
+ # Initialize model (requires the COVIS class from the original repository)
+ from models.covispose_model import COVIS
+
+ model = COVIS(
+     backbone=config['backbone'],
+     num_transformer_layers=config['num_transformer_layers'],
+     transformer_ffn_dim=config['transformer_ffn_dim']
+ )
+
+ # Load weights (the checkpoint wraps them in "model_state_dict")
+ checkpoint = torch.load(model_path, map_location='cpu')
+ if "model_state_dict" in checkpoint:
+     model.load_state_dict(checkpoint["model_state_dict"])
+ else:
+     model.load_state_dict(checkpoint)
+
+ model.eval()
+
+ # Inference: pass a pair of panorama tensors (see the preprocessing sketch below)
+ with torch.no_grad():
+     # outputs1, outputs2 = model(pano1_tensor, pano2_tensor)
+     pass
+ ```
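+
+ The model expects a pair of equirectangular panoramas at the configured 512×1024 input size. Continuing from the loading snippet above, here is a minimal preprocessing sketch: the file names are placeholders, and the ImageNet normalization is an assumption (common for resnet50 backbones), so verify it against the original training pipeline.
+
+ ```python
+ import torch
+ from PIL import Image
+ import torchvision.transforms as T
+
+ # Assumed preprocessing: resize to the configured 512x1024 input size and
+ # apply ImageNet normalization (verify against the original training code).
+ preprocess = T.Compose([
+     T.Resize((512, 1024)),
+     T.ToTensor(),
+     T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
+ ])
+
+ # "pano1.jpg" and "pano2.jpg" are placeholder file names
+ pano1_tensor = preprocess(Image.open("pano1.jpg").convert("RGB")).unsqueeze(0)
+ pano2_tensor = preprocess(Image.open("pano2.jpg").convert("RGB")).unsqueeze(0)
+ # Each tensor has shape [1, 3, 512, 1024], matching config["input_size"]
+
+ with torch.no_grad():
+     outputs1, outputs2 = model(pano1_tensor, pano2_tensor)
+ ```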
+
+ ## Model Architecture
+
+ The CovisPose model consists of:
+
+ 1. **Backbone Network**: resnet50 for feature extraction
+ 2. **Transformer Encoder**: 6 layers for processing image features
+ 3. **Prediction Heads**:
+    - Covisibility mask prediction
+    - Relative pose estimation
+    - Boundary detection
+
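+ The real implementation is the `COVIS` class in the original repository; the skeleton below is only an illustrative sketch of the three stages above, with assumed values for the token width (`d_model`), number of attention heads, and head output sizes. The layer count and FFN width follow `config.json`.
+
+ ```python
+ import torch.nn as nn
+ import torchvision
+
+ # Illustrative sketch only; not the actual COVIS API. d_model, nhead, and the
+ # head output sizes are placeholders.
+ class CovisPoseSketch(nn.Module):
+     def __init__(self, d_model=256, num_layers=6, ffn_dim=2048):
+         super().__init__()
+         # 1. Backbone: resnet50 with the pooling/classifier head removed
+         resnet = torchvision.models.resnet50(weights=None)
+         self.backbone = nn.Sequential(*list(resnet.children())[:-2])
+         self.proj = nn.Conv2d(2048, d_model, kernel_size=1)
+         # 2. Transformer encoder: 6 layers, FFN dimension 2048
+         layer = nn.TransformerEncoderLayer(
+             d_model, nhead=8, dim_feedforward=ffn_dim, batch_first=True)
+         self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
+         # 3. Prediction heads (placeholder output sizes)
+         self.covis_head = nn.Linear(d_model, 1)     # covisibility mask
+         self.pose_head = nn.Linear(d_model, 3)      # relative pose
+         self.boundary_head = nn.Linear(d_model, 1)  # floor-wall boundary
+
+     def forward(self, x):
+         feats = self.proj(self.backbone(x))        # [B, d_model, H', W']
+         tokens = feats.flatten(2).transpose(1, 2)  # [B, H'*W', d_model]
+         tokens = self.encoder(tokens)
+         return (self.covis_head(tokens),
+                 self.pose_head(tokens),
+                 self.boundary_head(tokens))
+ ```
+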
+ ## Task Description
+
+ **CovisPose** estimates the relative pose between two panoramic images by:
+
+ 1. **Covisibility Estimation**: Predicting which parts of the images overlap
+ 2. **Pose Regression**: Estimating relative rotation and translation
+ 3. **Boundary Detection**: Finding floor-wall boundaries for scale estimation
+
+ ## Training Data
+
+ This model was trained on panoramic image pairs with:
+ - Relative pose annotations
+ - Covisibility masks
+ - Floor-wall boundary labels
+
+ ## Limitations
+
+ - Designed specifically for indoor panoramic images
+ - Requires significant visual overlap between image pairs for reliable pose estimation
+ - Performance may degrade on outdoor scenes or images with minimal overlap
+
+ ## Citation
+
+ If you use this model, please cite the CovisPose work:
+
+ ```bibtex
+ @article{covispose2024,
+   title={CovisPose: Co-visibility Pose Estimation for Panoramic Images},
+   author={Your Authors},
+   journal={Conference/Journal},
+   year={2024}
+ }
+ ```
+
+ ## License
+
+ This model is released under the MIT License.
+
+ ## Repository
+
+ - **Training Code**: Available in the original repository
+ - **Model Upload**: Generated automatically from local checkpoint
+
+ ---
+
+ *Model uploaded on 2025-06-27T09:05:36.254713 using upload_model.py*
config.json ADDED
@@ -0,0 +1,15 @@
+ {
+   "architecture": "CovisPose",
+   "framework": "pytorch",
+   "model_type": "pose_estimation",
+   "task": "panoramic_pose_estimation",
+   "input_size": [
+     512,
+     1024
+   ],
+   "backbone": "resnet50",
+   "num_transformer_layers": 6,
+   "transformer_ffn_dim": 2048,
+   "num_classes": null,
+   "library_name": "pytorch"
+ }
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3b5ed1a2abbd8701fc9fab0d5d6dfbb9e2835f60ea17c25778b2137566e4b62b
+ size 1460570852
requirements.txt ADDED
@@ -0,0 +1,5 @@
+ torch>=1.9.0
+ torchvision>=0.10.0
+ numpy>=1.19.0
+ pillow>=8.0.0
+ huggingface_hub>=0.12.0
upload_metadata.json ADDED
@@ -0,0 +1,48 @@
+ {
+   "original_checkpoint": "outputs/hf_training/checkpoints/final_model.pth",
+   "upload_date": "2025-06-27T09:05:36.254713",
+   "analysis": {
+     "checkpoint_keys": [
+       "epoch",
+       "model_state_dict",
+       "optimizer_state_dict",
+       "loss"
+     ],
+     "has_model_state_dict": true,
+     "has_optimizer_state_dict": true,
+     "has_training_metrics": false,
+     "has_val_metrics": false,
+     "has_args": false,
+     "training_info": {},
+     "model_info": {
+       "num_parameters": 121890467,
+       "parameter_keys": [
+         "covis_feature_extractor.feature_extractor.encoder.conv1.1.weight",
+         "covis_feature_extractor.feature_extractor.encoder.bn1.weight",
+         "covis_feature_extractor.feature_extractor.encoder.bn1.bias",
+         "covis_feature_extractor.feature_extractor.encoder.bn1.running_mean",
+         "covis_feature_extractor.feature_extractor.encoder.bn1.running_var",
+         "covis_feature_extractor.feature_extractor.encoder.bn1.num_batches_tracked",
+         "covis_feature_extractor.feature_extractor.encoder.layer1.0.conv1.weight",
+         "covis_feature_extractor.feature_extractor.encoder.layer1.0.bn1.weight",
+         "covis_feature_extractor.feature_extractor.encoder.layer1.0.bn1.bias",
+         "covis_feature_extractor.feature_extractor.encoder.layer1.0.bn1.running_mean"
+       ]
+     }
+   },
+   "config": {
+     "architecture": "CovisPose",
+     "framework": "pytorch",
+     "model_type": "pose_estimation",
+     "task": "panoramic_pose_estimation",
+     "input_size": [
+       512,
+       1024
+     ],
+     "backbone": "resnet50",
+     "num_transformer_layers": 6,
+     "transformer_ffn_dim": 2048,
+     "num_classes": null,
+     "library_name": "pytorch"
+   }
+ }
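
For reference, the `analysis` fields above (the checkpoint's top-level keys and its parameter count) can be reproduced from the downloaded file. A minimal sketch; the exact counting convention used by upload_model.py (e.g. whether BatchNorm buffers are included) is an assumption:

```python
import torch
from huggingface_hub import hf_hub_download

path = hf_hub_download(repo_id="SGEthan/covis_toy", filename="pytorch_model.bin")
checkpoint = torch.load(path, map_location="cpu")

# Expected top-level keys: ['epoch', 'model_state_dict', 'optimizer_state_dict', 'loss']
print(list(checkpoint.keys()))

# Total elements across all tensors in the model state dict; whether this
# matches the reported 121,890,467 depends on how buffers were counted.
state = checkpoint["model_state_dict"]
print(sum(t.numel() for t in state.values()))
```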