LLaVA-Rad

LLaVA-Rad is a 7-billion-parameter small multimodal model trained to produce radiology findings from an input chest X-ray. Its architecture follows that of LLaVA and LLaVA-Med, differing in its use of a specialized chest X-ray image encoder, BiomedCLIP-CXR, built with the BiomedCLIP framework. LLaVA-Rad offers outstanding performance at a relatively small model size.

📌 Note: For the original model weights, refer to microsoft/llava-rad.

📃 Original paper: Towards a clinically accessible radiology foundation model: open-access and lightweight, with automated evaluation.


🔬 Experimental Usage in Libra's repo

This model checkpoint is intended for experimental use and can be tested directly within the Libra repository.

For better benchmarking, we recommend using the official test set from X-iZhang/MIMIC-CXR-RRG.
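
If you want to pull that test set programmatically, a minimal sketch using the Hugging Face datasets library is shown below; the split name is an assumption, so check the dataset card for the exact schema and column names.

# 🧪 Sketch: load the recommended benchmark split (split name is an assumption)
from datasets import load_dataset

test_split = load_dataset("X-iZhang/MIMIC-CXR-RRG", split="test")  # split name may differ
print(test_split)   # inspect the available columns before wiring up evaluation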

Use-case:

# 🧪 Inference example following the official LLaVA-Rad setup
from libra.eval import libra_eval

image_file = "https://openi.nlm.nih.gov/imgs/512/253/253/CXR253_IM-1045-1001.png"
model_path = "X-iZhang/libra-llava-rad"

answer = libra_eval(
    model_path=model_path,
    image_file=image_file,
    query="Describe the findings of the chest x-ray.\n",
    conv_mode="v1",          # Default conversation template
    temperature=0.0,         # Greedy decoding (deterministic output)
    max_new_tokens=1024,
)

# ✅ Expected output
print(answer)
# > Frontal and lateral chest radiographs demonstrate a moderate left
# > pneumothorax.  The right lung is clear.  The cardiomediastinal and hilar
# > contours are normal.

  • LLaVA-Rad outputs are formatted as structured report text, with line breaks (\n) intentionally preserved.
  • When performing automatic evaluation (e.g., ROUGE, BLEU, RadGraph), make sure to normalise or flatten the text if the metric implementation requires it (see the sketch below).
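
For instance, a minimal flattening helper could look like the sketch below; it is not part of the Libra API, and the whitespace handling should be adapted to whatever preprocessing your metric implementation expects.

# 🧪 Sketch: normalise report text before computing n-gram metrics
import re

def flatten_report(text: str) -> str:
    # Collapse preserved line breaks and repeated whitespace into single spaces
    # so prediction and reference are compared as one token stream.
    return re.sub(r"\s+", " ", text).strip()

prediction = flatten_report(answer)  # `answer` from the inference example above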

📚 Learn More

For a deeper dive into the methodology, theoretical insights, and performance benchmarks of the Libra framework, please refer to the Libra repository and the original paper linked above.


Disclaimer

This implementation is intended strictly for research and benchmarking purposes. It is not validated for clinical use, and any application in real-world diagnosis or treatment is strongly discouraged.

If any use case is found to violate these intended purposes (e.g., clinical deployment, misleading medical claims), the maintainers reserve the right to remove related code, models, or access permissions without prior notice.

License

This model is released under the MSRLA (Microsoft Research License Agreement).

