nielsr (HF Staff) committed on
Commit b0dc108 · verified · 1 parent: 42e6c57

Fix pipeline tag, add library_name


This PR updates the model card with the correct `pipeline_tag` and adds the `library_name`.

The `pipeline_tag` is changed from `depth-estimation` to `image-to-image` because the model processes an image and outputs a depth map image. The `library_name` is added as `diffusers` to correctly identify the library used for this model. The Hugging Face paper link is already present.
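For context, the added `library_name: diffusers` metadata matches how the checkpoint is consumed in code. A minimal sketch based on the diffusers Marigold integration linked from the card; the checkpoint id is assumed from the card title, and the image URL and step count are illustrative:

```python
import torch
from diffusers import MarigoldDepthPipeline
from diffusers.utils import load_image

# Checkpoint id assumed from the model card title ("marigold-depth-v1-1").
pipe = MarigoldDepthPipeline.from_pretrained(
    "prs-eth/marigold-depth-v1-1", variant="fp16", torch_dtype=torch.float16
).to("cuda")

image = load_image("https://example.com/photo.jpg")  # illustrative URL
result = pipe(image, num_inference_steps=4, ensemble_size=1)

# result.prediction holds the affine-invariant depth map with values in [0, 1].
vis = pipe.image_processor.visualize_depth(result.prediction)
vis[0].save("depth_colored.png")
```

This input-image-to-depth-image flow is also why `image-to-image` is a closer `pipeline_tag` fit than `depth-estimation` for how the model is actually invoked.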

Files changed (1): README.md (+85 −84)
@@ -1,84 +1,85 @@
 ---
-license: openrail++
 language:
 - en
-pipeline_tag: depth-estimation
-pinned: true
+license: openrail++
+library_name: diffusers
+pipeline_tag: image-to-image
 tags:
 - depth estimation
 - image analysis
 - computer vision
 - in-the-wild
 - zero-shot
+pinned: true
 ---
 
 <h1 align="center">Marigold Depth v1-1 Model Card</h1>
 
 <p align="center">
 <a title="Image Depth" href="https://huggingface.co/spaces/prs-eth/marigold" target="_blank" rel="noopener noreferrer" style="display: inline-block;">
 <img src="https://img.shields.io/badge/%F0%9F%A4%97%20Image%20Depth%20-Demo-yellow" alt="Image Depth">
 </a>
 <a title="diffusers" href="https://huggingface.co/docs/diffusers/using-diffusers/marigold_usage" target="_blank" rel="noopener noreferrer" style="display: inline-block;">
 <img src="https://img.shields.io/badge/%F0%9F%A4%97%20diffusers%20-Integration%20🧨-yellow" alt="diffusers">
 </a>
 <a title="Github" href="https://github.com/prs-eth/marigold" target="_blank" rel="noopener noreferrer" style="display: inline-block;">
 <img src="https://img.shields.io/github/stars/prs-eth/marigold?label=GitHub%20%E2%98%85&logo=github&color=C8C" alt="Github">
 </a>
 <a title="Website" href="https://marigoldcomputervision.github.io/" target="_blank" rel="noopener noreferrer" style="display: inline-block;">
 <img src="https://img.shields.io/badge/%E2%99%A5%20Project%20-Website-blue" alt="Website">
 </a>
 <a title="arXiv" href="https://arxiv.org/abs/2505.09358" target="_blank" rel="noopener noreferrer" style="display: inline-block;">
 <img src="https://img.shields.io/badge/%F0%9F%93%84%20Read%20-Paper-AF3436" alt="arXiv">
 </a>
 <a title="Social" href="https://twitter.com/antonobukhov1" target="_blank" rel="noopener noreferrer" style="display: inline-block;">
 <img src="https://img.shields.io/twitter/follow/:?label=Subscribe%20for%20updates!" alt="Social">
 </a>
 <a title="License" href="https://huggingface.co/stabilityai/stable-diffusion-2/blob/main/LICENSE-MODEL" target="_blank" rel="noopener noreferrer" style="display: inline-block;">
 <img src="https://img.shields.io/badge/License-OpenRAIL++-929292" alt="License">
 </a>
 </p>
 
 This is a model card for the `marigold-depth-v1-1` model for monocular depth estimation from a single image.
 The model is fine-tuned from the `stable-diffusion-2` [model](https://huggingface.co/stabilityai/stable-diffusion-2) as
 described in our papers:
 - [CVPR'2024 paper](https://arxiv.org/abs/2312.02145) titled "Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation"
-- [Jounal extension](https://www.arxiv.org/abs/2505.09358) titled "Marigold: Affordable Adaptation of Diffusion-Based Image Generators for Image Analysis"
+- [Journal extension](https://huggingface.co/papers/2505.09358) titled "Marigold: Affordable Adaptation of Diffusion-Based Image Generators for Image Analysis"
 
 ### Using the model
 
 - Play with the interactive [Hugging Face Spaces demo](https://huggingface.co/spaces/prs-eth/marigold): check out how the model works with example images or upload your own.
 - Use it with [diffusers](https://huggingface.co/docs/diffusers/using-diffusers/marigold_usage) to compute the results with a few lines of code.
 - Get to the bottom of things with our [official codebase](https://github.com/prs-eth/marigold).
 
 ## Model Details
 - **Developed by:** [Bingxin Ke](http://www.kebingxin.com/), [Kevin Qu](https://ch.linkedin.com/in/kevin-qu-b3417621b), [Tianfu Wang](https://tianfwang.github.io/), [Nando Metzger](https://nandometzger.github.io/), [Shengyu Huang](https://shengyuh.github.io/), [Bo Li](https://www.linkedin.com/in/bobboli0202), [Anton Obukhov](https://www.obukhov.ai/), [Konrad Schindler](https://scholar.google.com/citations?user=FZuNgqIAAAAJ).
 - **Model type:** Generative latent diffusion-based affine-invariant monocular depth estimation from a single image.
 - **Language:** English.
 - **License:** [CreativeML Open RAIL++-M License](https://huggingface.co/stabilityai/stable-diffusion-2/blob/main/LICENSE-MODEL).
 - **Model Description:** This model can be used to generate an estimated depth map of an input image.
 - **Resolution**: Even though any resolution can be processed, the model inherits the base diffusion model's effective resolution of roughly **768** pixels.
 This means that for optimal predictions, any larger input image should be resized to make the longer side 768 pixels before feeding it into the model.
 - **Steps and scheduler**: This model was designed for usage with the **DDIM** scheduler and between **1 and 50** denoising steps.
 - **Outputs**:
   - **Affine-invariant depth map**: The predicted values are between 0 and 1, interpolating between the near and far planes of the model's choice.
   - **Uncertainty map**: Produced only when multiple predictions are ensembled with ensemble size larger than 2.
 - **Resources for more information:** [Project Website](https://marigoldcomputervision.github.io/), [Paper](https://arxiv.org/abs/2505.09358), [Code](https://github.com/prs-eth/marigold).
 - **Cite as:**
 
 ```bibtex
 @misc{ke2025marigold,
   title={Marigold: Affordable Adaptation of Diffusion-Based Image Generators for Image Analysis},
   author={Bingxin Ke and Kevin Qu and Tianfu Wang and Nando Metzger and Shengyu Huang and Bo Li and Anton Obukhov and Konrad Schindler},
   year={2025},
   eprint={2505.09358},
   archivePrefix={arXiv},
   primaryClass={cs.CV}
 }
 
 @InProceedings{ke2023repurposing,
   title={Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation},
   author={Bingxin Ke and Anton Obukhov and Shengyu Huang and Nando Metzger and Rodrigo Caye Daudt and Konrad Schindler},
   booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
   year={2024}
 }
 ```
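The card's resolution note (resize any larger input so the longer side is 768 px, never upscale) amounts to a small computation; a pure-Python sketch, with an illustrative function name:

```python
def target_size(width: int, height: int, longest: int = 768) -> tuple[int, int]:
    """Dimensions after downscaling so the longer side equals `longest`.

    Images already at or below the target are left untouched (no upscaling).
    """
    if max(width, height) <= longest:
        return width, height
    scale = longest / max(width, height)
    return round(width * scale), round(height * scale)

print(target_size(1536, 1024))  # a 3:2 landscape photo -> (768, 512)
```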
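The card describes the depth output as affine-invariant: values in [0, 1], defined only up to an unknown scale and shift. Recovering metric depth against ground truth is then a least-squares fit of those two parameters; a sketch with NumPy (the helper name is illustrative):

```python
import numpy as np

def align_depth(pred: np.ndarray, gt: np.ndarray) -> np.ndarray:
    """Solve min over (s, t) of ||s * pred + t - gt||^2, return aligned prediction."""
    # Design matrix: one column for the prediction, one for the constant shift.
    A = np.stack([pred.ravel(), np.ones(pred.size)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, gt.ravel(), rcond=None)
    return s * pred + t
```

In practice the fit would be restricted to pixels with valid ground truth, but the scale-and-shift recovery itself is exactly this.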
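The uncertainty map mentioned in the card exists only when several predictions are ensembled. A simplified NumPy sketch of the idea (median as consensus, mean absolute deviation as uncertainty); this is a stand-in for intuition, not the paper's exact ensembling procedure:

```python
import numpy as np

def ensemble_depth(preds: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """preds: (N, H, W) stack of mutually aligned predictions.

    Returns a per-pixel consensus depth and a per-pixel disagreement score.
    """
    consensus = np.median(preds, axis=0)
    uncertainty = np.mean(np.abs(preds - consensus), axis=0)
    return consensus, uncertainty
```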