LiheYoung committed on
Commit
4ebcd7f
1 Parent(s): 4e6ee43

Update README.md

Files changed (1)
  1. README.md +113 -3
README.md CHANGED
@@ -1,3 +1,113 @@
1
- ---
2
- license: apache-2.0
3
- ---
1
+ ---
2
+ license: apache-2.0
3
+ ---
4
+
5
+ # Depth Anything V2 for Metric Depth Estimation
6
+
7
+ ## Pre-trained Models
8
+
9
+ We provide **six metric depth models**: three model scales, each trained separately for indoor and outdoor scenes.
10
+
11
+ | Base Model | Params | Indoor (Hypersim) | Outdoor (Virtual KITTI 2) |
12
+ |:-|-:|:-:|:-:|
13
+ | Depth-Anything-V2-Small | 24.8M | [Download](https://huggingface.co/depth-anything/Depth-Anything-V2-Metric-Hypersim-Small/resolve/main/depth_anything_v2_metric_hypersim_vits.pth?download=true) | [Download](https://huggingface.co/depth-anything/Depth-Anything-V2-Metric-VKITTI-Small/resolve/main/depth_anything_v2_metric_vkitti_vits.pth?download=true) |
14
+ | Depth-Anything-V2-Base | 97.5M | [Download](https://huggingface.co/depth-anything/Depth-Anything-V2-Metric-Hypersim-Base/resolve/main/depth_anything_v2_metric_hypersim_vitb.pth?download=true) | [Download](https://huggingface.co/depth-anything/Depth-Anything-V2-Metric-VKITTI-Base/resolve/main/depth_anything_v2_metric_vkitti_vitb.pth?download=true) |
15
+ | Depth-Anything-V2-Large | 335.3M | [Download](https://huggingface.co/depth-anything/Depth-Anything-V2-Metric-Hypersim-Large/resolve/main/depth_anything_v2_metric_hypersim_vitl.pth?download=true) | [Download](https://huggingface.co/depth-anything/Depth-Anything-V2-Metric-VKITTI-Large/resolve/main/depth_anything_v2_metric_vkitti_vitl.pth?download=true) |
16
+
17
+ *We recommend first trying the larger models (if the computational cost is affordable) and the indoor version.*
18
+
19
+ ## Usage
20
+
21
+ ### Preparation
22
+
23
+ ```bash
24
+ git clone https://github.com/DepthAnything/Depth-Anything-V2
25
+ cd Depth-Anything-V2/metric_depth
26
+ pip install -r requirements.txt
27
+ ```
28
+
29
+ Download the checkpoints listed [here](#pre-trained-models) and put them under the `checkpoints` directory.
30
+
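As a minimal sketch of this step: the download links in the table follow a regular pattern, so a checkpoint URL and its local path can be derived from the encoder and dataset names. The helpers `checkpoint_filename`, `checkpoint_url`, and `download_checkpoint` are hypothetical names, not part of the repository; the URL pattern is simply read off the table above.

```python
import os
import urllib.request

# Name fragments read off the download table; the helpers are illustrative only.
SCALES = {'vits': 'Small', 'vitb': 'Base', 'vitl': 'Large'}
DATASETS = {'hypersim': 'Hypersim', 'vkitti': 'VKITTI'}


def checkpoint_filename(encoder, dataset):
    """Return the checkpoint file name for an encoder/dataset pair."""
    if encoder not in SCALES:
        raise ValueError(f'unknown encoder: {encoder!r}')
    if dataset not in DATASETS:
        raise ValueError(f'unknown dataset: {dataset!r}')
    return f'depth_anything_v2_metric_{dataset}_{encoder}.pth'


def checkpoint_url(encoder, dataset):
    """Build the download URL, following the pattern used in the table."""
    name = checkpoint_filename(encoder, dataset)
    repo = f'Depth-Anything-V2-Metric-{DATASETS[dataset]}-{SCALES[encoder]}'
    return f'https://huggingface.co/depth-anything/{repo}/resolve/main/{name}?download=true'


def download_checkpoint(encoder, dataset, directory='checkpoints'):
    """Download the checkpoint into `directory` unless it is already there."""
    os.makedirs(directory, exist_ok=True)
    target = os.path.join(directory, checkpoint_filename(encoder, dataset))
    if not os.path.exists(target):
        urllib.request.urlretrieve(checkpoint_url(encoder, dataset), target)
    return target
```

For example, `download_checkpoint('vitl', 'hypersim')` would place the large indoor checkpoint under `checkpoints/`, where the usage code expects it.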
31
+ ### Use our models
32
+ ```python
33
+ import cv2
34
+ import torch
35
+
36
+ from depth_anything_v2.dpt import DepthAnythingV2
37
+
38
+ model_configs = {
39
+     'vits': {'encoder': 'vits', 'features': 64, 'out_channels': [48, 96, 192, 384]},
40
+     'vitb': {'encoder': 'vitb', 'features': 128, 'out_channels': [96, 192, 384, 768]},
41
+     'vitl': {'encoder': 'vitl', 'features': 256, 'out_channels': [256, 512, 1024, 1024]}
42
+ }
43
+
44
+ encoder = 'vitl' # or 'vits', 'vitb'
45
+ dataset = 'hypersim' # 'hypersim' for indoor model, 'vkitti' for outdoor model
46
+ max_depth = 20 # 20 for indoor model, 80 for outdoor model
47
+
48
+ model = DepthAnythingV2(**{**model_configs[encoder], 'max_depth': max_depth})
49
+ model.load_state_dict(torch.load(f'checkpoints/depth_anything_v2_metric_{dataset}_{encoder}.pth', map_location='cpu'))
50
+ model.eval()
51
+
52
+ raw_img = cv2.imread('your/image/path')
53
+ depth = model.infer_image(raw_img) # HxW NumPy array of depth values in meters
54
+ ```
55
+
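The `dataset` and `max_depth` settings have to agree (20 for the Hypersim indoor models, 80 for the Virtual KITTI 2 outdoor models), and a mismatch is easy to introduce by hand. The sketch below keeps the two in one lookup; `metric_model_setup` is a hypothetical helper, not something the repository provides, and it only prepares the arguments rather than constructing the model.

```python
MODEL_CONFIGS = {
    'vits': {'encoder': 'vits', 'features': 64, 'out_channels': [48, 96, 192, 384]},
    'vitb': {'encoder': 'vitb', 'features': 128, 'out_channels': [96, 192, 384, 768]},
    'vitl': {'encoder': 'vitl', 'features': 256, 'out_channels': [256, 512, 1024, 1024]},
}

# Maximum depth in meters for each training dataset, as stated above.
MAX_DEPTH = {'hypersim': 20, 'vkitti': 80}


def metric_model_setup(encoder, dataset, checkpoint_dir='checkpoints'):
    """Return (constructor kwargs, checkpoint path) for one encoder/dataset pair."""
    if encoder not in MODEL_CONFIGS:
        raise ValueError(f'unknown encoder: {encoder!r}')
    if dataset not in MAX_DEPTH:
        raise ValueError(f'unknown dataset: {dataset!r}')
    kwargs = {**MODEL_CONFIGS[encoder], 'max_depth': MAX_DEPTH[dataset]}
    path = f'{checkpoint_dir}/depth_anything_v2_metric_{dataset}_{encoder}.pth'
    return kwargs, path


# Intended use with the snippet above (requires the repository and a checkpoint):
#   kwargs, path = metric_model_setup('vitl', 'hypersim')
#   model = DepthAnythingV2(**kwargs)
#   model.load_state_dict(torch.load(path, map_location='cpu'))
```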
56
+ ### Run the script on images
57
+
58
+ Here, we take the `vitl` encoder as an example. You can also use `vitb` or `vits` encoders.
59
+
60
+ ```bash
61
+ # indoor scenes
62
+ python run.py \
63
+ --encoder vitl \
64
+ --load-from checkpoints/depth_anything_v2_metric_hypersim_vitl.pth \
65
+ --max-depth 20 \
66
+ --img-path <path> --outdir <outdir> [--input-size <size>] [--save-numpy]
67
+
68
+ # outdoor scenes
69
+ python run.py \
70
+ --encoder vitl \
71
+ --load-from checkpoints/depth_anything_v2_metric_vkitti_vitl.pth \
72
+ --max-depth 80 \
73
+ --img-path <path> --outdir <outdir> [--input-size <size>] [--save-numpy]
74
+ ```
75
+
76
+ ### Project 2D images to point clouds
77
+
78
+ ```bash
79
+ python depth_to_pointcloud.py \
80
+ --encoder vitl \
81
+ --load-from checkpoints/depth_anything_v2_metric_hypersim_vitl.pth \
82
+ --max-depth 20 \
83
+ --img-path <path> --outdir <outdir>
84
+ ```
85
+
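For intuition about what such a projection involves, the sketch below back-projects a metric depth map with the standard pinhole camera model. It is not taken from `depth_to_pointcloud.py`; the function name `backproject` and the intrinsics (`fx`, `fy`, and a principal point that defaults to the image center) are assumptions, and real intrinsics should come from your camera. Plain Python lists keep the example dependency-free; the same arithmetic vectorizes directly in NumPy.

```python
def backproject(depth, fx, fy, cx=None, cy=None):
    """Convert an HxW depth map in meters into a list of (x, y, z) points.

    Pinhole model: x = (u - cx) * z / fx, y = (v - cy) * z / fy.
    """
    height = len(depth)
    width = len(depth[0])
    # Assumption: the principal point sits at the image center unless given.
    cx = width / 2 if cx is None else cx
    cy = height / 2 if cy is None else cy

    points = []
    for v, row in enumerate(depth):      # v: pixel row
        for u, z in enumerate(row):      # u: pixel column, z: depth in meters
            if z <= 0:
                continue                 # skip pixels without a valid depth
            points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points
```

Because the depth values are metric, the resulting coordinates are in meters as well, provided the focal lengths are expressed in pixels.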
86
+ ### Reproduce training
87
+
88
+ Please first prepare the [Hypersim](https://github.com/apple/ml-hypersim) and [Virtual KITTI 2](https://europe.naverlabs.com/research/computer-vision/proxy-virtual-worlds-vkitti-2/) datasets. Then:
89
+
90
+ ```bash
91
+ bash dist_train.sh
92
+ ```
93
+
94
+
95
+ ## Citation
96
+
97
+ If you find this project useful, please consider citing:
98
+
99
+ ```bibtex
100
+ @article{depth_anything_v2,
101
+ title={Depth Anything V2},
102
+ author={Yang, Lihe and Kang, Bingyi and Huang, Zilong and Zhao, Zhen and Xu, Xiaogang and Feng, Jiashi and Zhao, Hengshuang},
103
+ journal={arXiv:2406.09414},
104
+ year={2024}
105
+ }
106
+
107
+ @inproceedings{depth_anything_v1,
108
+ title={Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data},
109
+ author={Yang, Lihe and Kang, Bingyi and Huang, Zilong and Xu, Xiaogang and Feng, Jiashi and Zhao, Hengshuang},
110
+ booktitle={CVPR},
111
+ year={2024}
112
+ }
113
+ ```