CLOSP
CLOSP (Contrastive Language Optical SAR Pretraining) is a multimodal architecture designed for text-to-image retrieval. It creates a unified embedding space for text, Sentinel-2 (MSI), and Sentinel-1 (SAR) data.
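Because text, MSI, and SAR embeddings share one space, retrieval reduces to ranking images by their similarity to the query embedding. The sketch below illustrates that idea with cosine similarity in plain Python. The embeddings are toy values, and the names `rank_images`, `s2_patch_a`, `s1_patch_b`, and `s2_patch_c` are hypothetical; nothing here is taken from the CLOSP codebase, and the similarity measure the authors use is assumed, not confirmed.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


def rank_images(text_embedding, image_embeddings):
    """Return image ids sorted from most to least similar to the text query.

    `image_embeddings` maps an id to its embedding; SAR and MSI images can be
    mixed freely because they are assumed to live in the same space.
    """
    scores = {
        image_id: cosine_similarity(text_embedding, embedding)
        for image_id, embedding in image_embeddings.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy 3-dimensional embeddings (hypothetical values, for illustration only).
query = [0.9, 0.1, 0.0]
gallery = {
    "s2_patch_a": [1.0, 0.0, 0.0],
    "s1_patch_b": [0.0, 1.0, 0.0],
    "s2_patch_c": [0.7, 0.7, 0.0],
}
ranking = rank_images(query, gallery)
print(ranking)
```

In practice the query embedding would come from the text encoder and the gallery embeddings from the visual encoders, computed once and cached.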
This repository provides the visual encoders separately, in PyTorch format.
Model Details
The model uses three separate encoders: one for text, one for Sentinel-1 (SAR) data, and one for Sentinel-2 (MSI) data. During training, it uses a contrastive objective to align the textual embeddings with the corresponding visual embeddings (either SAR or MSI).
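The card does not specify the exact form of the contrastive objective. As a rough illustration, the sketch below implements a CLIP-style symmetric cross-entropy (InfoNCE) loss over a batch of matched text and visual embeddings. The symmetric form, the `temperature` default of 0.07, and the function names are all assumptions, not details confirmed by the source. Plain Python is used for readability; actual training would use batched tensor operations in PyTorch.

```python
import math


def log_softmax(row):
    """Numerically stable log-softmax of a list of floats."""
    m = max(row)
    log_z = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_z for x in row]


def contrastive_loss(text_emb, visual_emb, temperature=0.07):
    """Symmetric contrastive loss (assumed CLIP-style, not confirmed).

    `text_emb[i]` and `visual_emb[i]` form a matching pair; every other
    combination in the batch acts as a negative. Inputs are expected to be
    L2-normalized, so the dot product equals cosine similarity.
    """
    n = len(text_emb)
    # Pairwise similarity matrix, scaled by the temperature.
    logits = [
        [
            sum(t * v for t, v in zip(text_emb[i], visual_emb[j])) / temperature
            for j in range(n)
        ]
        for i in range(n)
    ]
    # Text-to-visual direction: each row should peak on its diagonal entry.
    text_to_visual = -sum(log_softmax(logits[i])[i] for i in range(n)) / n
    # Visual-to-text direction: same computation on the transposed matrix.
    columns = [list(col) for col in zip(*logits)]
    visual_to_text = -sum(log_softmax(columns[j])[j] for j in range(n)) / n
    return 0.5 * (text_to_visual + visual_to_text)


# Toy batch: correctly paired embeddings vs. deliberately swapped pairs.
texts = [[1.0, 0.0], [0.0, 1.0]]
aligned = contrastive_loss(texts, [[1.0, 0.0], [0.0, 1.0]])
swapped = contrastive_loss(texts, [[0.0, 1.0], [1.0, 0.0]])
print(aligned < swapped)
```

The visual embeddings in a batch may come from either the SAR or the MSI encoder. How the two modalities are mixed during training is not stated here, so the sketch treats them uniformly.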
- Developed by: Daniele Rege Cambrin
- Model type: CLOSP
- Language(s) (NLP): English
- License: OpenRAIL
- Repository: GitHub
- Paper: arXiv:2507.10403 (https://arxiv.org/abs/2507.10403)
Citation
@misc{cambrin2025texttoremotesensingimageretrievalrgbsources,
  title={Text-to-Remote-Sensing-Image Retrieval beyond RGB Sources},
  author={Daniele Rege Cambrin and Lorenzo Vaiani and Giuseppe Gallipoli and Luca Cagliero and Paolo Garza},
  year={2025},
  eprint={2507.10403},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2507.10403},
}