sebastianffx committed
Commit 6f6b0f7 · verified · 1 Parent(s): 9d5b424

Sync with Github readme

Files changed (1): README.md (+93 -9)
README.md CHANGED
Removed:

- ---
- license: mit
- tags:
- - image-feature-extraction
- ---
- ### Usage

Updated README.md:
# Kaiko midnight
Midnight - Training State-of-the-Art Pathology Foundation Models with Orders of Magnitude Less Data

This repository contains the model checkpoints for the Midnight-12k model presented in our paper titled "Training state-of-the-art pathology foundation models with orders of magnitude less data." Our approach achieves competitive performance compared to leading pathology foundation models (FMs), despite being trained on significantly fewer whole slide images (WSIs).

## Overview

We propose a refined self-supervised training framework based on DINOv2 with modifications that optimize model performance specifically for computational pathology. Our main contributions include:

- Three novel pathology FMs trained with significantly reduced data (up to 100x fewer WSIs).
- Introduction of high-resolution post-training to enhance embedding quality.

## Model Highlights

- **Midnight-12k**: Trained exclusively on the publicly available TCGA dataset (12k WSIs).
- **Midnight-92k**: Trained on TCGA and an additional proprietary dataset (PRV-80k).
- **Midnight-92k/392**: Our top-performing model fine-tuned with high-resolution post-training.

## Training Datasets

| Dataset | WSIs | Source      | Comment                       |
|---------|------|-------------|-------------------------------|
| TCGA    | 12k  | Public      | FFPE only                     |
| NKI-80k | 80k  | Proprietary | 10,141 patients, 31 organs    |
| GTEx    | 25k  | Public      | Healthy subjects              |
| CPTAC   | 7.2k | Public      | Tumor samples from 13 cohorts |

## Training Components

- **DINOv2**: Self-supervised training with [DINOv2](https://github.com/facebookresearch/dinov2).
- **KDE regularizer**: Replaced KoLeo in DINOv2 to ensure embedding diversity and training stability (a rough sketch follows this list).
- **Online patching**: Efficient real-time extraction of informative tiles.
- **Color augmentation (HED)**: Robustness to stain variations.
- **Tile filtering**: Removal of low-informative tissue regions.

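The KDE regularizer is only named above; the exact loss is specified in the paper. As a rough, illustrative sketch (not the paper's implementation), a Gaussian-kernel density estimate over the batch embeddings can be penalized so that embeddings spread out instead of collapsing. The function name `kde_diversity_loss`, the bandwidth, and the log-density form below are assumptions.

```python
import torch
import torch.nn.functional as F

def kde_diversity_loss(embeddings: torch.Tensor, bandwidth: float = 0.5) -> torch.Tensor:
    """Toy Gaussian-KDE diversity regularizer (illustrative only)."""
    z = F.normalize(embeddings, dim=-1)                  # work on the unit hypersphere
    sq_dists = (2.0 - 2.0 * z @ z.T).clamp(min=0.0)      # pairwise squared Euclidean distances
    kernel = torch.exp(-sq_dists / (2.0 * bandwidth ** 2))
    n = z.shape[0]
    density = (kernel.sum(dim=1) - 1.0) / (n - 1)        # leave-one-out density at each point
    return torch.log(density.mean() + 1e-8)              # lower mean density = more diverse batch

# Example: such a term would be added to the main self-supervised objective with a small weight.
feats = torch.randn(16, 256, requires_grad=True)
reg = kde_diversity_loss(feats)
reg.backward()
```

The kernel choice and weighting here are placeholders; see the paper for the actual formulation.
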
## Evaluation

We comprehensively evaluated the models using two sets of open-source benchmarks:

- [eva](https://github.com/kaiko-ai/eva): For both tile-level (classification, segmentation) and slide-level tasks.
- [HEST](https://github.com/mahmoodlab/HEST): For gene expression prediction tasks (regression).

Our best model, **Midnight-92k/392**, consistently outperforms or matches leading models like Virchow2 and UNI-2.

## Results Summary

| Model | AVG. | PCam 10 shots | BACH | BRACS | BreaKHis | CRC | Gleason | MHIST | PCam | Cam16 (small) | Panda (small) | CoNSeP | MoNuSAC | HEST |
|-------|------|---------------|------|-------|----------|-----|---------|-------|------|---------------|---------------|--------|---------|------|
| **[Midnight-92k/392](#usage)** | **0.779** | **0.900** | **0.904** | **0.646** | 0.802 | 0.966 | **0.807** | 0.828 | **0.951** | 0.883 | 0.651 | **0.662** | **0.708** | 0.415 |
| [UNI-2](https://huggingface.co/MahmoodLab/UNI2-h) | **0.777** | **0.885** | **0.924** | **0.651** | **0.863** | **0.970** | 0.777 | 0.829 | **0.951** | 0.884 | **0.666** | 0.626 | 0.644 | **0.431** |
| [Virchow2](https://huggingface.co/paige-ai/Virchow2) | **0.769** | 0.835 | 0.890 | 0.633 | 0.818 | 0.966 | **0.791** | **0.865** | 0.938 | **0.890** | 0.655 | 0.640 | 0.674 | 0.403 |
| **[Midnight-92k](#usage)** | 0.768 | **0.882** | 0.889 | 0.615 | 0.793 | **0.967** | **0.823** | 0.831 | 0.948 | 0.882 | 0.643 | 0.629 | 0.656 | **0.425** |
| **[Midnight-12k](#usage)** | 0.761 | 0.803 | **0.907** | 0.639 | 0.840 | **0.967** | 0.790 | 0.815 | 0.931 | 0.855 | 0.648 | 0.625 | 0.664 | 0.412 |
| [H-Optimus-0](https://huggingface.co/bioptimus/H-optimus-0) | 0.759 | 0.831 | 0.752 | 0.620 | 0.813 | 0.962 | 0.769 | **0.850** | 0.943 | **0.896** | **0.672** | **0.644** | **0.687** | **0.425** |
| [Kaiko-B8](https://github.com/kaiko-ai/towards_large_pathology_fms) | 0.757 | 0.799 | 0.876 | 0.641 | **0.842** | 0.960 | 0.761 | 0.830 | 0.920 | 0.847 | 0.650 | **0.644** | 0.686 | 0.391 |
| [Prov_GigaPath](https://github.com/prov-gigapath/prov-gigapath) | 0.757 | 0.853 | 0.794 | 0.626 | **0.846** | 0.959 | 0.727 | 0.831 | 0.944 | 0.887 | 0.657 | 0.628 | **0.688** | 0.405 |
| [Hibou-L](https://huggingface.co/histai/hibou-L) | 0.753 | 0.825 | 0.792 | **0.643** | 0.767 | 0.954 | 0.766 | **0.850** | **0.949** | 0.866 | **0.667** | **0.646** | 0.668 | 0.397 |
| [UNI](https://huggingface.co/MahmoodLab/UNI) | 0.753 | 0.833 | 0.797 | 0.613 | 0.808 | 0.954 | 0.759 | 0.841 | 0.937 | **0.899** | 0.662 | 0.627 | 0.662 | 0.391 |
| [Phikon](https://huggingface.co/owkin/phikon) | 0.727 | 0.826 | 0.744 | 0.579 | 0.715 | 0.946 | 0.743 | 0.824 | 0.919 | 0.861 | 0.648 | 0.624 | 0.644 | 0.377 |
| [Phikon-v2](https://huggingface.co/owkin/phikon-v2) | 0.722 | 0.756 | 0.737 | 0.607 | 0.725 | 0.953 | 0.753 | 0.796 | 0.900 | 0.867 | 0.634 | 0.626 | 0.645 | 0.391 |
| [Lunit](https://github.com/lunit-io/benchmark-ssl-pathology) | 0.720 | 0.763 | 0.785 | 0.627 | 0.759 | 0.943 | 0.758 | 0.785 | 0.905 | 0.836 | 0.604 | 0.600 | 0.630 | 0.362 |
| [vitg14 (nat. img.)](https://github.com/facebookresearch/dinov2) | 0.675 | 0.721 | 0.724 | 0.578 | 0.783 | 0.943 | 0.740 | 0.855 | 0.881 | 0.505 | 0.509 | 0.565 | 0.614 | 0.351 |
| [vitg14 (initial)](https://github.com/facebookresearch/dinov2) | 0.493 | 0.652 | 0.474 | 0.413 | 0.425 | 0.754 | 0.459 | 0.578 | 0.763 | 0.532 | 0.304 | 0.462 | 0.432 | 0.166 |

## Model Weights

- Midnight-12k: [Publicly available](https://huggingface.co/kaiko-ai/midnight/tree/main) under the permissive MIT license.
- Midnight-92k & Midnight-92k/392: Trained on proprietary datasets and subject to restricted access.

## Usage

**Midnight-12k** is publicly available at [https://huggingface.co/kaiko-ai/midnight](https://huggingface.co/kaiko-ai/midnight).

Our models are trained on 224x224 images normalized with a mean of (0.5, 0.5, 0.5) and a standard deviation of (0.5, 0.5, 0.5). Please ensure you apply these exact normalization parameters when preparing your datasets for embedding extraction.
 
```python
from transformers import AutoImageProcessor, AutoModel
# ... (unchanged lines, including the `transform = v2.Compose(` preprocessing definition, are elided in this diff)
model = AutoModel.from_pretrained('kaiko-ai/midnight')
```
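The diff elides most of this block; the hunk context only shows that it builds a `transform = v2.Compose(` preprocessing pipeline before loading the model. A rough reconstruction consistent with the stated 224x224 input size and 0.5/0.5 normalization is sketched below; the exact resize/crop steps and the image-loading code are assumptions, not the original README lines.

```python
import torch
from PIL import Image
from torchvision.transforms import v2
from transformers import AutoModel

# Hypothetical preprocessing matching the stated training setup:
# 224x224 inputs, mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5).
transform = v2.Compose(
    [
        v2.ToImage(),
        v2.Resize(size=224),
        v2.CenterCrop(size=224),
        v2.ToDtype(torch.float32, scale=True),
        v2.Normalize(mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5)),
    ]
)

model = AutoModel.from_pretrained('kaiko-ai/midnight')
image = Image.open("tile.png").convert("RGB")    # hypothetical local H&E tile
batch = transform(image).unsqueeze(dim=0)
with torch.no_grad():
    tokens = model(batch).last_hidden_state      # CLS token + 16x16 patch tokens
```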
 
 
### Extract embeddings for classification
For segmentation tasks, the model output corresponds to 16x16 patch tokens (derived from 224/14=16).
```python
import torch

# ... (unchanged lines defining `extract_classification_embedding` and building `batch` are elided in this diff)
embedding = extract_classification_embedding(model(batch).last_hidden_state)
print(f"Embedding shape: {embedding[0].shape}")
```
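The body of `extract_classification_embedding` is not shown in this diff. A common recipe for DINOv2-style backbones, and a plausible but assumed reconstruction, is to concatenate the CLS token with the mean of the patch tokens:

```python
import torch

def extract_classification_embedding(tensor: torch.Tensor) -> torch.Tensor:
    """Assumed reconstruction: concatenate the CLS token with the mean patch token."""
    cls_embedding, patch_embeddings = tensor[:, 0, :], tensor[:, 1:, :]
    return torch.cat([cls_embedding, patch_embeddings.mean(dim=1)], dim=-1)

# Quick shape check with dummy tokens: 1 CLS + 16*16 patches; the hidden size of 1536 is assumed.
dummy = torch.randn(1, 1 + 16 * 16, 1536)
print(extract_classification_embedding(dummy).shape)  # torch.Size([1, 3072])
```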
 
 
### Extract embeddings for segmentation

```python
import math
import torch

# ... (unchanged lines defining `extract_segmentation_embedding(tensor)` and loading `image` are elided in this diff)
batch = transform(image).unsqueeze(dim=0)
embedding = extract_segmentation_embedding(model(batch).last_hidden_state)
print(f"Embedding shape: {embedding[0].shape}")
```
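The definition of `extract_segmentation_embedding` is likewise elided. Given the note above that the output contains 16x16 patch tokens, a plausible but assumed reconstruction drops the CLS token and reshapes the patch tokens into a spatial feature map:

```python
import math
import torch

def extract_segmentation_embedding(tensor: torch.Tensor) -> torch.Tensor:
    """Assumed reconstruction: reshape patch tokens to (batch, hidden, height, width)."""
    patch_embeddings = tensor[:, 1:, :]                   # drop the CLS token
    batch_size, num_patches, hidden_size = patch_embeddings.shape
    height = width = int(math.sqrt(num_patches))          # 224 / 14 = 16
    return patch_embeddings.permute(0, 2, 1).reshape(batch_size, hidden_size, height, width)

# Quick shape check with dummy tokens (hidden size of 1536 is assumed).
dummy = torch.randn(1, 1 + 16 * 16, 1536)
print(extract_segmentation_embedding(dummy).shape)  # torch.Size([1, 1536, 16, 16])
```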

## Citation
```bibtex
@article{KDK2025,
  title={Training state-of-the-art pathology foundation models with orders of magnitude less data},
  author={Mikhail Karasikov and Joost van Doorn and Nicolas Känzig and Melis Erdal Cesur and Hugo Horlings and Robert Berke and Fei Tang and Sebastian Otálora},
  year={2025},
  journal={arXiv preprint}
}
```

<br />

<div align="center">
  <img src="https://github.com/kaiko-ai/midnight/blob/main/docs/images/kaiko-logo.png?raw=true" width="200">
</div>