SAIL-Recon: Large SfM by Augmenting Scene Regression with Localization

HKUST Spatial Artificial Intelligence Lab; Horizon Robotics

Junyuan Deng, Heng Li, Tao Xie, Weiqiang Ren, Qian Zhang, Ping Tan, Xiaoyang Guo

Overview

SAIL-Recon is a feed-forward Transformer that scales neural scene regression to large-scale Structure-from-Motion (SfM) by augmenting it with visual localization. From a few anchor views, it builds a global latent scene representation that encodes both geometry and appearance. Conditioned on this representation, the network directly regresses camera poses, intrinsics, depth maps, and scene coordinate maps for thousands of images in minutes, enabling precise and robust reconstruction without iterative optimization.
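The regressed quantities are linked by standard pinhole geometry: a pixel's depth, the camera intrinsics, and the camera pose together determine that pixel's 3D scene coordinate. The sketch below illustrates this relation only and is not taken from the SAIL-Recon code. It assumes a distortion-free pinhole camera, depth measured along the camera z-axis, and a camera-to-world pose given as a rotation `R` and translation `t`; the function name `pixel_to_world` and all of its parameters are hypothetical.

```python
def pixel_to_world(u, v, depth, fx, fy, cx, cy, R, t):
    """Map one pixel with known depth to a 3D point in world coordinates.

    Assumptions (not confirmed by the repository): pinhole intrinsics
    (fx, fy, cx, cy), depth along the camera z-axis, and a camera-to-world
    pose where R is a 3x3 nested list and t is a length-3 sequence.
    """
    # Back-project the pixel into the camera frame.
    x_cam = (u - cx) / fx * depth
    y_cam = (v - cy) / fy * depth
    p_cam = (x_cam, y_cam, depth)
    # Rotate and translate the point into the world frame: X = R @ p_cam + t.
    return tuple(
        sum(R[i][j] * p_cam[j] for j in range(3)) + t[i] for i in range(3)
    )
```

Applying this to every pixel of a depth map would give a per-pixel map of world points, which is the kind of quantity a scene coordinate map stores. The model's actual pose and depth conventions may differ from the ones assumed here.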

TODO

  • Inference Code Release
  • Gradio Demo
  • Evaluation Script

Quick Start

First, clone this repository to your local machine and install the dependencies (torch, torchvision, numpy, Pillow, and huggingface_hub), following the VGGT setup.

git clone https://github.com/HKUST-SAIL/sail-recon.git
cd sail-recon
pip install -e .

You can download demo images (e.g., Barn from Tanks & Temples) and put them in examples/demo_image.

Now, you can try the model demo:

# Images
python demo.py --img_dir path/to/your/images --out_dir outputs
# Video
python demo.py --vid_dir path/to/your/video --out_dir outputs

The resulting PLY file and camera poses are written to outputs.
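To sanity-check a reconstruction without opening a viewer, you can read the header of the PLY file and see how many points it contains. The sketch below uses only the Python standard library and relies on the generic PLY header layout (`ply`, `format ...`, `element <name> <count>`, `end_header`). The helper name `read_ply_header` is hypothetical, and the name of the file that demo.py writes is not stated here, so pass whatever path you find under outputs.

```python
def read_ply_header(path):
    """Return (format_name, {element_name: count}) from a PLY file header."""
    fmt = None
    elements = {}
    with open(path, "rb") as f:
        # Every PLY file starts with the magic line "ply".
        if f.readline().strip() != b"ply":
            raise ValueError(f"{path} is not a PLY file")
        for raw in f:
            parts = raw.decode("ascii", errors="replace").split()
            if not parts:
                continue
            if parts[0] == "format":
                fmt = parts[1]  # e.g. "ascii" or "binary_little_endian"
            elif parts[0] == "element":
                elements[parts[1]] = int(parts[2])
            elif parts[0] == "end_header":
                break  # stop before the (possibly binary) payload
    return fmt, elements
```

For example, `fmt, elements = read_ply_header(path_to_ply)` followed by `elements.get("vertex")` gives the number of points in the cloud, provided the file declares a `vertex` element.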

We also provide a Gradio demo for easier usage. You can run the demo by:

python demo_gradio.py

Please note that the Gradio demo is slower than demo.py because of the additional visualization step.

Evaluation

Please refer to this for more details.

Acknowledgements

Thanks to these great repositories:

ACE0 for the PSNR evaluation;

VGGT for the GitHub template, the Gradio demo, and the visualization;

Fast3R for the training data processing and some utility functions;

And many other inspiring works in the community.

If you find this project useful in your research, please consider citing:

@article{dengli2025sail,
  title={SAIL-Recon: Large SfM by Augmenting Scene Regression with Localization},
  author={Deng, Junyuan and Li, Heng and Xie, Tao and Ren, Weiqiang and Zhang, Qian and Tan, Ping and Guo, Xiaoyang},
  journal={arXiv preprint arXiv:2508.17972},
  year={2025}
}

License

See the LICENSE file for details about the license under which this code is made available.

Please see the VGGT license for the other code used in this project.

Please see the ACE0 license for the evaluation code used in this project.

Please see the Fast3R license for the utility functions used in this project.