MedM-VL-2D-3B-en

Introduction

MedM-VL-2D-3B-en is a medical large vision-language model (LVLM) trained on English data. It takes text and a single 2D medical image as input and produces text as output, supporting tasks such as report generation, medical VQA, referring expression comprehension, referring expression generation, and image classification.

Here are the evaluation results on Uni-Med:

Method            medmnist_derma  medmnist_organs  medpix  mimic  pathvqa  samed_identify  samed_refer  slake_identify  slake_refer  slakevqa
Med-Flamingo      1.15            8.90             8.14    23.25  33.38    -               -            -               -            21.51
RadFM             5.14            18.90            -       6.81   24.83    -               -            -               -            81.66
LLaVA-Med         25.84           66.80            15.11   20.43  37.79    45.83           8.64         27.21           4.07         33.69
MedM-VL-2D-3B-en  85.49           80.68            14.45   25.50  64.06    74.11           26.42        82.94           33.51        85.86
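
To make the comparison in the table easier to read, the following sketch finds the top-scoring method for each task. It assumes that higher scores are better for every column and that "-" means no result was reported; neither is stated in this card. The names `TASKS`, `SCORES`, and `best_per_task` are illustrative only.

```python
# Scores copied from the evaluation table; None marks entries shown as "-".
TASKS = [
    "medmnist_derma", "medmnist_organs", "medpix", "mimic", "pathvqa",
    "samed_identify", "samed_refer", "slake_identify", "slake_refer", "slakevqa",
]
SCORES = {
    "Med-Flamingo": [1.15, 8.90, 8.14, 23.25, 33.38, None, None, None, None, 21.51],
    "RadFM": [5.14, 18.90, None, 6.81, 24.83, None, None, None, None, 81.66],
    "LLaVA-Med": [25.84, 66.80, 15.11, 20.43, 37.79, 45.83, 8.64, 27.21, 4.07, 33.69],
    "MedM-VL-2D-3B-en": [85.49, 80.68, 14.45, 25.50, 64.06, 74.11, 26.42, 82.94, 33.51, 85.86],
}


def best_per_task(scores, tasks):
    """Return {task: method with the highest reported score} (assumes higher is better)."""
    best = {}
    for i, task in enumerate(tasks):
        # Skip methods that did not report a score for this task.
        reported = {method: vals[i] for method, vals in scores.items() if vals[i] is not None}
        best[task] = max(reported, key=reported.get)
    return best


best = best_per_task(SCORES, TASKS)
wins = sum(1 for method in best.values() if method == "MedM-VL-2D-3B-en")
for task, method in best.items():
    print(f"{task:16s} {method}")
print(f"MedM-VL-2D-3B-en leads on {wins} of {len(TASKS)} tasks")
```

Under these assumptions, MedM-VL-2D-3B-en has the highest reported score on nine of the ten tasks; on medpix, LLaVA-Med is slightly ahead (15.11 vs. 14.45).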

Quickstart

Please refer to the MedM-VL project for usage instructions.
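
Before running the MedM-VL code, the weights need to be available locally. The following is a minimal sketch for fetching them, assuming they are hosted on the Hugging Face Hub and that the `huggingface_hub` package is installed. `download_weights` is a hypothetical helper, not part of MedM-VL, and the repository namespace is not given in this card, so it is left as a parameter. Inference itself is handled by MedM-VL and is not shown here.

```python
from typing import Optional


def download_weights(repo_id: str, local_dir: Optional[str] = None) -> str:
    """Download a full model repository from the Hugging Face Hub and return its local path.

    repo_id must be the full '<namespace>/<model-name>' identifier (an assumption:
    the namespace for MedM-VL-2D-3B-en is not stated in this card).
    """
    if "/" not in repo_id:
        raise ValueError("repo_id must look like '<namespace>/<model-name>'")
    # Imported lazily so this file can be imported without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    # With local_dir=None the files go to the default Hugging Face cache.
    return snapshot_download(repo_id=repo_id, local_dir=local_dir)


if __name__ == "__main__":
    import sys

    if len(sys.argv) == 2:
        print(download_weights(sys.argv[1]))
    else:
        print("usage: python download_weights.py <namespace>/MedM-VL-2D-3B-en")
```

The returned path can then be passed to whatever loading routine MedM-VL documents.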

Model size: 3.65B params (Safetensors). Tensor type: BF16.