---
library_name: transformers
tags:
- resnet
- SAR
- RADAR
- EO
- backbone
- ocean
- wind
- sentinel-1
license: apache-2.0
pipeline_tag: image-feature-extraction
---
# Model Card for OceanSAR-1
## Model Details
<img src="OceanSAR-1-logo.png" width=400>
### Model Description
OceanSAR-1 is the first foundation model in the OceanSAR family, specifically designed for Synthetic Aperture Radar (SAR) imagery analysis, with a focus on ocean observation. The model is trained using a novel dynamic dataset pruning strategy that enhances training efficiency and feature quality.
- **Developed by:** Thomas Kerdreux, Alexandre Tuel @ [Galeio](http://galeio.fr)
- **Deployed by:** Antoine Audras @ [Galeio](http://galeio.fr)
- **Model type:** Vision Foundation Model (ResNet50/ViT variants)
- **License:** Apache License 2.0
- **Training data:** Sentinel-1 Wave Mode (WV) SAR images (2015-2024)
- **Training regime:** DINO self-supervised learning with dynamic dataset pruning
## Uses
### Direct Use
The model is intended to be used as a feature extractor for SAR image analysis, particularly for ocean observation tasks. It can be used for:
- Feature extraction from SAR images
- Transfer learning for downstream tasks
### Downstream Use
The model has been validated on three downstream tasks:
1. **TenGeoP Classification**: Classification of 10 geophysical phenomena in SAR images
2. **Significant Wave Height Estimation**: Regression task for ocean wave height prediction
3. **Wind Speed Prediction**: Regression task for surface wind speed estimation
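As a rough illustration of how frozen features might feed a downstream task such as TenGeoP classification, here is a minimal linear-probe sketch. It is not the authors' evaluation code: the features and labels are random stand-ins (real features would come from the model's pooled output), and the optimizer, learning rate, and step count are arbitrary assumptions chosen only to keep the example small.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Stand-in data (ASSUMPTION): random tensors in place of real OceanSAR-1
# features and TenGeoP labels. 2048 matches the ResNet50 pooled feature size.
num_samples, feature_dim, num_classes = 64, 2048, 10
features = torch.randn(num_samples, feature_dim)
labels = torch.randint(0, num_classes, (num_samples,))

# A linear probe is a single linear layer trained on top of frozen features.
probe = nn.Linear(feature_dim, num_classes)
optimizer = torch.optim.SGD(probe.parameters(), lr=0.01)  # hypothetical settings
loss_fn = nn.CrossEntropyLoss()

losses = []
for _ in range(20):
    optimizer.zero_grad()
    loss = loss_fn(probe(features), labels)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

# Predicted class index for each sample
predictions = probe(features).argmax(dim=1)
```

For the two regression tasks, the same pattern would apply with a single output unit and a regression loss such as `nn.MSELoss`.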
## How to Use
```python
import torch
from transformers import AutoModel
# Load the pretrained model
model = AutoModel.from_pretrained("galeio-research/OceanSAR-1")
# Prepare your SAR image (should be single-channel VV polarization)
# Here using random data as an example
dummy_image = torch.randn(1, 1, 256, 256)  # (B, C, H, W)
# Extract features
with torch.no_grad():
    outputs = model(dummy_image)
    features = outputs.pooler_output  # Shape: (1, 2048) for ResNet50
```
## Training Details
### Training Data
- **Dataset:** Sentinel-1 Wave Mode (WV) SAR images
- **Time period:** 2015-2024
- **Size:** ~12 million images
- **Preprocessing:**
  - Spatial downsampling to 50m resolution
  - Dynamic dataset pruning for diversity and balancedness
  - Excluded validation images from training set
### Dynamic Dataset Pruning
The model uses a novel dynamic dataset pruning strategy that:
- Maximizes dataset diversity and balancedness
- Reduces computational costs
- Improves model performance on downstream tasks
- Works without requiring a pre-existing feature extractor
## Evaluation
### Results
The model achieves state-of-the-art performance on three downstream tasks (linear probing):
1. **TenGeoP Classification**:
   - ResNet50: 75.5% accuracy
   - ViT-S/16: 78.6% accuracy
   - ViT-S/8: 82.1% accuracy
   - ViT-B/8: 83.6% accuracy
2. **Significant Wave Height Estimation**:
   - RMSE: 0.63-0.72m (depending on architecture)
3. **Wind Speed Prediction**:
   - RMSE: 1.37-1.43 m/s (depending on architecture)
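
The two regression results above are reported as root-mean-square error (RMSE), which is in the same unit as the target (metres for wave height, m/s for wind speed). The sketch below shows how that metric is computed. The function name and the sample values are hypothetical and are not taken from the evaluation.

```python
import math


def rmse(predictions, targets):
    """Root-mean-square error between two equal-length sequences of numbers."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not predictions:
        raise ValueError("at least one value is required")
    squared_errors = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical significant wave heights in metres (illustration only)
predicted_heights = [2.0, 3.5, 1.0]
measured_heights = [2.5, 3.0, 1.5]
error = rmse(predicted_heights, measured_heights)
print(error)  # 0.5
```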
For commercial deployments or to access optimized model variants for specific operational needs, feel free to reach out to discuss licensing and support options.
## Technical Specifications
### Hardware Requirements
- GPU with at least 8GB VRAM recommended
### Dependencies
- PyTorch >= 1.8.0
- Transformers >= 4.30.0
- torchvision >= 0.9.0
### Input Specifications
- Input size: 256x256 pixels
- Single channel (VV polarization)
- Normalized pixel values
- SAR images from Sentinel-1 Wave Mode
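
A minimal sketch of shaping a single-polarization array into the layout listed above. This card does not state the exact normalization, so the per-image standardization here is an assumption, as are the helper name `prepare_sar_image` and the use of bilinear resizing. Check them against the preprocessing used in training before relying on the output.

```python
import torch
import torch.nn.functional as F


def prepare_sar_image(image: torch.Tensor, size: int = 256) -> torch.Tensor:
    """Convert a 2-D (H, W) VV image into a (1, 1, size, size) float batch."""
    if image.ndim != 2:
        raise ValueError("expected a 2-D (H, W) tensor")
    # Add batch and channel dimensions: (H, W) -> (1, 1, H, W)
    x = image.float()[None, None]
    # Resize to the expected input size (bilinear resizing is an assumption)
    x = F.interpolate(x, size=(size, size), mode="bilinear", align_corners=False)
    # ASSUMPTION: per-image standardization; the actual scheme may differ
    x = (x - x.mean()) / (x.std() + 1e-6)
    return x


# Random stand-in for a real Sentinel-1 WV image
raw = torch.rand(512, 512)
batch = prepare_sar_image(raw)
```

The resulting tensor has the same `(B, C, H, W)` layout as the dummy input in the usage example.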
## Citation
**BibTeX:**
```bibtex
@article{kerdreux2025efficientselfsupervisedlearningearth,
title={Efficient Self-Supervised Learning for Earth Observation via Dynamic Dataset Curation},
author={Kerdreux, Thomas and Tuel, Alexandre and Febvre, Quentin and Mouche, Alexis and Chapron, Bertrand},
journal={arXiv preprint arXiv:2504.06962},
year={2025},
eprint={2504.06962},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2504.06962},
}
```
## Acknowledgements
This work was granted access to the HPC resources of IDRIS and TGCC under the allocation 2025-[A0171015666] made by GENCI.