MedM-VL-2D-3B-en
Introduction
MedM-VL-2D-3B-en is a medical large vision-language model (LVLM) trained on English data. It accepts text and a single 2D medical image as input and produces text as output, enabling tasks such as report generation, medical VQA, referring expression comprehension, referring expression generation, and image classification.
Here are the evaluation results on Uni-Med:
| Method | medmnist_derma | medmnist_organs | medpix | mimic | pathvqa | samed_identify | samed_refer | slake_identify | slake_refer | slakevqa |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Med-Flamingo | 1.15 | 8.90 | 8.14 | 23.25 | 33.38 | - | - | - | - | 21.51 |
| RadFM | 5.14 | 18.90 | - | 6.81 | 24.83 | - | - | - | - | 81.66 |
| LLaVA-Med | 25.84 | 66.80 | 15.11 | 20.43 | 37.79 | 45.83 | 8.64 | 27.21 | 4.07 | 33.69 |
| MedM-VL-2D-3B-en | 85.49 | 80.68 | 14.45 | 25.50 | 64.06 | 74.11 | 26.42 | 82.94 | 33.51 | 85.86 |
Quickstart
For environment setup and inference instructions, please refer to the MedM-VL repository.
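As a starting point, the snippet below is a minimal sketch of downloading the checkpoint with `huggingface_hub` before running it through the MedM-VL codebase. The `repo_id` shown is an assumption; replace it with this model's actual repository name.

```python
# Minimal sketch: fetch the model weights locally, then run inference
# following the MedM-VL repository's instructions.
from huggingface_hub import snapshot_download

# NOTE: the repo_id below is hypothetical; substitute the real repository name.
local_dir = snapshot_download(repo_id="MedM-VL/MedM-VL-2D-3B-en")
print(f"Model checkpoint downloaded to: {local_dir}")
```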