manu02 / DINOv3-Interactive-Patch-Cosine-Similarity


title: DINOv3 Web/Sat Interactive Similarity
emoji: 🦖
colorFrom: yellow
colorTo: gray
sdk: gradio
sdk_version: 5.43.1
app_file: app.py
pinned: false
license: mit
short_description: Visualize image patch similarity like in DINOv3 presentation

DINOv3 Patch Similarity Viewer (GitHub Repo)

Note: This README and repository are for educational purposes. The repo was inspired by the DINOv3 paper and is meant to help visualize and understand the model's output.

Purpose

This repository provides interactive tools to visualize and explore patch-wise similarity in images using the DINOv3 vision transformer model. It is designed for researchers, students, and practitioners interested in understanding how self-supervised vision transformers perceive and relate different regions of an image.

About DINOv3

Paper: DINOv3: Self-supervised Vision Transformers with Enormous Teacher Models
Meta Research Page: Meta DINOv3 Publication
Official GitHub: facebookresearch/dinov3

Note:
The DINOv3 model weights require access approval.
You can request access via the Meta Research page or by selecting the desired model in the Hugging Face model collection.

Features

Interactive Visualization: Click on image patches or use arrow keys to explore patch similarity heatmaps.
Single or Two-Image Mode: If one image is specified, shows self-similarity. If two images are specified, shows both self-similarity and cross-image similarity overlays interactively.
Image Preprocessing: Loads and pads images without resizing, preserving the original aspect ratio.
Cosine Similarity Calculation: Computes and visualizes cosine similarity between image patches.
Robust Fallback: If an image URL fails to load, a default image is used.
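
As a rough illustration of the cosine similarity step, the sketch below scores one query patch embedding against every patch embedding of an image (the same image for self-similarity, a second image for cross-image similarity). It uses plain Python lists to stay dependency-free; the repository itself relies on PyTorch, and the function and parameter names here are hypothetical rather than taken from the repo.

```python
import math


def cosine_similarity(a, b, eps=1e-8):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # eps guards against division by zero for all-zero embeddings.
    return dot / max(norm_a * norm_b, eps)


def similarity_map(query, patches, grid_w):
    """Score `query` against every patch and reshape into grid rows.

    `patches` is a flat, row-major list of patch embeddings and
    `grid_w` is the number of patches per row (hypothetical names).
    """
    scores = [cosine_similarity(query, p) for p in patches]
    return [scores[i:i + grid_w] for i in range(0, len(scores), grid_w)]


# Toy 2x2 grid of 3-dimensional embeddings.
patches = [
    [1.0, 0.0, 0.0], [0.0, 1.0, 0.0],
    [2.0, 0.0, 0.0], [-1.0, 0.0, 0.0],
]
heatmap = similarity_map(patches[0], patches, grid_w=2)
```

Because cosine similarity ignores vector length, the patch `[2.0, 0.0, 0.0]` scores the same as the query itself, while the opposite direction `[-1.0, 0.0, 0.0]` scores -1. For cross-image similarity, the query would come from one image and `patches` from the other.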

Installation

Install dependencies with:

pip install -r requirements.txt

Model Selection

You can choose from several DINOv3 models available on Hugging Face (click to view each model card):

LVD-1689M Dataset (Web data)

SAT-493M Dataset (Satellite data)

ViT
- facebook/dinov3-vitl16-pretrain-sat493m
- facebook/dinov3-vit7b16-pretrain-sat493m

Usage

Gradio app

Run the Gradio app:

python app.py

After starting the app, open http://localhost:7860/ in your browser to use it.

Then:

Choose the dataset and model name.
For single-image similarity:
- Choose only one file or URL.
For two-image similarity:
- Choose the images from files and/or URLs.
Click the "Initialize / Update" button.
Select the desired patch in the image.
View the results.

Note: Overlay alpha controls the opacity of the similarity heatmap drawn on top of the image.
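
One common way to read an overlay alpha value is as a per-pixel linear blend between the image and the heatmap color. The sketch below assumes that interpretation; the function name and the RGB tuple representation are illustrative and not taken from the app.

```python
def blend_pixel(image_rgb, heat_rgb, alpha=0.55):
    """Linearly blend one heatmap pixel over one image pixel.

    alpha=0 leaves the image untouched, alpha=1 shows only the heatmap.
    The 0.55 default mirrors the script's documented --overlay_alpha
    default; whether the app uses the same value is an assumption.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return tuple(
        round((1.0 - alpha) * i + alpha * h)
        for i, h in zip(image_rgb, heat_rgb)
    )


blended = blend_pixel((200, 100, 0), (0, 100, 200), alpha=0.5)
```

With `alpha=0.5` both sources contribute equally, so the example pixel blends to a mid gray.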

Python Script

Run the interactive viewer with the default COCO image:

python DINOv3CosSimilarity.py

Single Image Mode

Specify your own image (local path or URL):

python DINOv3CosSimilarity.py --image path/to/your/image.jpg
python DINOv3CosSimilarity.py --image https://yourdomain.com/image.png

Two Image Mode

Specify two images (local paths or URLs):

python DINOv3CosSimilarity.py --image1 path/to/image1.jpg --image2 path/to/image2.jpg
python DINOv3CosSimilarity.py --image1 https://yourdomain.com/image1.png --image2 https://yourdomain.com/image2.png

Model Selection

Specify the model with --model (default is vits16):

python DINOv3CosSimilarity.py --model facebook/dinov3-vitb16-pretrain-lvd1689m

Other Options

--show_grid : Draw patch grid
--annotate_indices : Write patch indices on cells
--overlay_alpha <float> : Set heatmap alpha (default 0.55)
--patch_size <int> : Override patch size (default: model's patch size)
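
The flags documented above could be declared with argparse roughly as follows. This is a sketch reconstructed from the option list, not the script's actual parser: the help strings, the exact default string for --model, and the way --image coexists with --image1/--image2 are assumptions.

```python
import argparse


def build_parser():
    """Sketch of a parser mirroring the flags documented in this README."""
    p = argparse.ArgumentParser(description="DINOv3 patch similarity viewer")
    # Single-image mode uses --image; two-image mode uses --image1/--image2.
    p.add_argument("--image", help="local path or URL (single-image mode)")
    p.add_argument("--image1", help="first image, local path or URL")
    p.add_argument("--image2", help="second image, local path or URL")
    # The README only says the default is "vits16"; the exact string is assumed.
    p.add_argument("--model", default="vits16", help="model name or Hub id")
    p.add_argument("--show_grid", action="store_true", help="draw patch grid")
    p.add_argument("--annotate_indices", action="store_true",
                   help="write patch indices on cells")
    p.add_argument("--overlay_alpha", type=float, default=0.55,
                   help="heatmap alpha")
    # None stands for "use the model's own patch size".
    p.add_argument("--patch_size", type=int, default=None,
                   help="override patch size")
    return p


args = build_parser().parse_args(
    ["--image", "cat.jpg", "--show_grid", "--overlay_alpha", "0.4"]
)
```

Using `store_true` for --show_grid and --annotate_indices matches how they are documented (no value after the flag), and leaving --patch_size at `None` lets the caller fall back to the model's patch size.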

Controls

Mouse click to select a patch
Arrow keys to move selection
'1', '2', or 't' to switch active image (in two-image mode)
'q' to quit
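
The mouse and arrow-key controls above amount to mapping a pixel position to a patch cell and moving that cell around a grid. A minimal sketch follows, assuming a square patch size (16 for the ViT/16 models named in this README), row-major patch indices, and an image padded up to a multiple of the patch size; all function names are hypothetical and not taken from the script.

```python
import math


def padded_size(width, height, patch=16):
    """Smallest size >= (width, height) that is a multiple of `patch`."""
    return (math.ceil(width / patch) * patch,
            math.ceil(height / patch) * patch)


def click_to_patch(x, y, width, height, patch=16):
    """Map a pixel click to (row, col, flat_index) in the patch grid."""
    pad_w, pad_h = padded_size(width, height, patch)
    grid_w, grid_h = pad_w // patch, pad_h // patch
    # Clamp so clicks on the border still land in a valid cell.
    col = min(max(int(x) // patch, 0), grid_w - 1)
    row = min(max(int(y) // patch, 0), grid_h - 1)
    return row, col, row * grid_w + col


# Arrow keys as (row delta, column delta).
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def move_selection(row, col, key, grid_h, grid_w):
    """Move the selected patch by one cell, staying inside the grid."""
    d_row, d_col = MOVES.get(key, (0, 0))
    return (min(max(row + d_row, 0), grid_h - 1),
            min(max(col + d_col, 0), grid_w - 1))


# A 100x50 image pads to 112x64, i.e. a 7x4 grid of 16-pixel patches.
print(padded_size(100, 50))              # (112, 64)
print(click_to_patch(40, 20, 100, 50))   # (1, 2, 9)
print(move_selection(0, 0, "up", 4, 7))  # (0, 0)
```

Clamping in both helpers keeps the selection inside the grid, so pressing an arrow key at an edge simply leaves the selection where it is.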

Demo Single Image

Demo 2 Images

Jupyter Notebook

Open PatchCosSimilarity.ipynb in Jupyter Notebook.
Run the cells to load an image and visualize patch similarities.
Set url1 for single-image mode, or both url1 and url2 for two-image mode.
If an image fails to load, a default image will be used automatically.
Set the model_id variable to any of the models listed above (see commented lines at the top of the notebook).
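
The fallback behaviour described above can be sketched as a small wrapper around whatever function actually loads an image. The loader is passed in as a parameter so the example stays self-contained; `load_with_fallback`, `fake_loader`, and the file names are hypothetical and not part of the notebook.

```python
def load_with_fallback(loader, source, default_source):
    """Try `loader(source)`; on any failure, load `default_source` instead.

    Returns (image, used_source) so callers can tell which one was loaded.
    An empty or missing `source` goes straight to the default.
    """
    if source:
        try:
            return loader(source), source
        except Exception as exc:  # broad on purpose: network, decode, I/O
            print(f"Could not load {source!r} ({exc}); using default image.")
    return loader(default_source), default_source


# Stand-in loader for demonstration: it fails for anything but the default.
def fake_loader(src):
    if src != "default.jpg":
        raise OSError("unreachable")
    return f"<image from {src}>"


image, used = load_with_fallback(
    fake_loader, "https://example.com/missing.png", "default.jpg"
)
```

Returning the source that was actually used makes it easy to tell the user that a default image replaced the one they asked for.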

Notebook Controls:

Mouse click to select a patch
Arrow keys to move selection
'1', '2', or 't' to switch active image (in two-image mode)

License

This project is licensed under the MIT License. See the LICENSE file for details.

Acknowledgments

This project uses the DINOv3 model through Hugging Face's Transformers library, along with PyTorch, Matplotlib, and Pillow.