Align3R: Aligned Monocular Depth Estimation for Dynamic Videos

Jiahao Lu*, Tianyu Huang*, Peng Li, Zhiyang Dou, Cheng Lin, Zhiming Cui, Zhen Dong, Sai-Kit Yeung, Wenping Wang, Yuan Liu

arXiv, 2024

Align3R estimates temporally consistent video depth, dynamic point clouds, and camera poses from monocular videos.

@article{lu2024align3r,
  title={Align3R: Aligned Monocular Depth Estimation for Dynamic Videos},
  author={Lu, Jiahao and Huang, Tianyu and Li, Peng and Dou, Zhiyang and Lin, Cheng and Cui, Zhiming and Dong, Zhen and Yeung, Sai-Kit and Wang, Wenping and Liu, Yuan},
  journal={arXiv preprint arXiv:2412.03079},
  year={2024}
}

How to use

First, install Align3R following the instructions in the repository. Then load the pretrained model:

import torch
from dust3r.model import AsymmetricCroCo3DStereo

# Download the pretrained Align3R checkpoint from the Hugging Face Hub
model = AsymmetricCroCo3DStereo.from_pretrained("cyun9286/Align3R_DepthAnythingV2_ViTLarge_BaseDecoder_512_dpt")

# Move the model to the GPU if one is available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
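
Align3R is built on the DUSt3R codebase, so running the model on a pair of frames is expected to follow DUSt3R's standard inference API. The sketch below assumes that load_images, make_pairs, and inference behave as in DUSt3R; the frame paths are placeholders, and the Align3R repository's demo scripts should be consulted for the full video pipeline (pairwise prediction followed by global optimization of depth and camera poses).

from dust3r.inference import inference
from dust3r.utils.image import load_images
from dust3r.image_pairs import make_pairs

# Load two video frames, resized so the long side is 512 (the checkpoint's resolution).
# Replace these placeholder paths with your own frames.
images = load_images(["frame_000.png", "frame_001.png"], size=512)

# Build symmetric image pairs to feed the two-view network.
pairs = make_pairs(images, scene_graph="complete", prefilter=None, symmetrize=True)

# Run pairwise inference; the output contains per-view pointmaps and confidence maps,
# from which depth maps and relative poses can be recovered.
output = inference(pairs, model, device, batch_size=1)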