ModernVBERT

community

Activity Feed

AI & ML interests

None defined yet.

Recent Activity

QuentinJG authored a paper about 1 month ago

ViDoRe V3: A Comprehensive Evaluation of Retrieval Augmented Generation in Complex Real-World Scenarios

antonioloison authored a paper about 1 month ago

ViDoRe V3: A Comprehensive Evaluation of Retrieval Augmented Generation in Complex Real-World Scenarios

QuentinJG submitted a paper about 1 month ago

ViDoRe V3: A Comprehensive Evaluation of Retrieval Augmented Generation in Complex Real-World Scenarios

View all activity

Papers

ModernVBERT: Towards Smaller Visual Document Retrievers

View all Papers

Organization Card

Community About org cards

ModernVBERT: Towards Smaller Visual Document Retrievers

This organization contains all artifacts released with our preprint ModernVBERT: Towards Smaller Visual Document Retrievers.

Abstract

Multimodal embedding models are gaining prevalence, notably for document retrieval as efficient alternatives to text-only pipelines. These models are typically built by finetuning large vision–language decoders (VLMs) with contrastive losses on text–image pairs. In this work, we show that, while cost-efficient, this repurposing approach often bottlenecks retrieval performance. Through controlled experiments, we establish a principled recipe for improving visual document retrieval models. We notably measure the impact of attention masking, image resolution, modality alignment data regimes, and late interaction centered contrastive objectives which emerge as central performance factors. Building on these insights, we release ModernVBERT, a compact 250M-parameter vision–language encoder that outperforms models up to 10 times larger when finetuned on document retrieval tasks.

Ressources

HuggingFace Project Page: The HF page centralizing everything!
Preprint: The paper with all details!
Blog Post: The blog post introducing our release!
Finetuning Tutorial: A colab notebook to learn how to finetune ModernVBERT for document retrieval!
Codebase: The codebase with the training configs and scripts to train ModernVBERT.

Contact of the authors

Paul Teiletche: paul.teiletche@epfl.ch
Quentin Macé: quentin.mace@illuin.tech
Max Conti: max.conti@illuin.tech
Manuel Faysse: manuel.faysse@centralesupelec.fr

Citation

If you use any datasets or models from this organization in your research, please cite the original dataset as follows:

@misc{teiletche2025modernvbertsmallervisualdocument,
      title={ModernVBERT: Towards Smaller Visual Document Retrievers}, 
      author={Paul Teiletche and Quentin Macé and Max Conti and Antonio Loison and Gautier Viaud and Pierre Colombo and Manuel Faysse},
      year={2025},
      eprint={2510.01149},
      archivePrefix={arXiv},
      primaryClass={cs.IR},
      url={https://arxiv.org/abs/2510.01149}, 
}

Acknowledgments

This work was carried out within the framework of the LIAGORA "LabCom", a joint laboratory supported by the French National Research Agency (ANR) and established between ILLUIN Technology and the MICS laboratory of CentraleSupélec. This work was performed using HPC resources from IDRIS with grant AD011016393. We warmly thank Hippolyte Gisserot-Boukhlef and Nicolas Boizard for sharing the controlled experiments LM checkpoints, Antoine Chaffin for his feedback on the modality alignment codebase and insights on Ettin’s modeling, as well as Andi Marafioti, Orr Zohar, and Miquel Farré for their valuable input and help on gathering the modality alignment dataset.

Collections 2

models 7

datasets 2

ModernVBERT/toy-colpali-train-set

Viewer • Updated Oct 2, 2025 • 1.5k • 50

ModernVBERT/natcap

Viewer • Updated Jul 15, 2025 • 339k • 481 • 1

ModernVBERT

AI & ML interests

Recent Activity

Papers

ModernVBERT: Towards Smaller Visual Document Retrievers

Abstract

Ressources

Contact of the authors

Citation

Acknowledgments

Collections 2

ModernVBERT/colmodernvbert

ModernVBERT/colmodernvbert-base

vidore/colpali_train_set

rlhn/rlhn-680K

ModernVBERT/modernvbert

HuggingFaceM4/the_cauldron

HuggingFaceM4/Docmatix

ModernVBERT/modernvbert-embed

ModernVBERT/colmodernvbert

ModernVBERT/colmodernvbert-base

vidore/colpali_train_set

rlhn/rlhn-680K

ModernVBERT/modernvbert

HuggingFaceM4/the_cauldron

HuggingFaceM4/Docmatix

ModernVBERT/modernvbert-embed

models 7

ModernVBERT/modernvbert

ModernVBERT/colmodernvbert-merged

ModernVBERT/colmodernvbert-base

ModernVBERT/modernvbert-embed

ModernVBERT/bimodernvbert

ModernVBERT/colmodernvbert

ModernVBERT/ablation_checkpoints

datasets 2

ModernVBERT/toy-colpali-train-set

ModernVBERT/natcap

AI & ML interests

Recent Activity

Papers

Team members 6

ModernVBERT: Towards Smaller Visual Document Retrievers

Abstract

Ressources

Contact of the authors

Citation

Acknowledgments

Collections 2

models 7 Sort: Recently updated

datasets 2 Sort: Recently updated

models 7

datasets 2