lukeingawesome

Upload LLM2Vec4CXR fine-tuned model

71d3180 verified about 1 month ago

license: mit
base_model: microsoft/LLM2CLIP-Llama-3.2-1B-Instruct-CC-Finetuned
tags:
  - text-embeddings
  - sentence-transformers
  - llm2vec
  - medical
  - chest-xray
  - radiology
  - clinical-nlp
language:
  - en
pipeline_tag: feature-extraction
library_name: transformers

LLM2Vec4CXR - Fine-tuned Model for Chest X-ray Report Analysis

This model is a fine-tuned version of microsoft/LLM2CLIP-Llama-3.2-1B-Instruct-CC-Finetuned specifically optimized for chest X-ray report analysis and medical text understanding.

Model Description

LLM2Vec4CXR converts the base decoder-only LLM into a bidirectional text encoder optimized for medical text embeddings. The model was fully fine-tuned with a modified pooling strategy (latent_attention) to better capture semantic relationships in chest X-ray reports.
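
For intuition, latent-attention pooling can be sketched as follows. This is a minimal illustration under assumptions, not this model's actual implementation: the class name `LatentAttentionPooling`, the number of latents, the number of heads, and the MLP shape are all hypothetical. Each token's hidden state acts as a query over a small trainable latent array, and the attended token states are then averaged over the positions selected by a mask.

```python
import torch
import torch.nn as nn


class LatentAttentionPooling(nn.Module):
    """Illustrative latent-attention pooling (hypothetical, not the model's code)."""

    def __init__(self, hidden_dim: int, num_latents: int = 8, num_heads: int = 2):
        super().__init__()
        # Trainable latent array, used as keys and values for every token query
        self.latents = nn.Parameter(torch.randn(num_latents, hidden_dim))
        self.attn = nn.MultiheadAttention(hidden_dim, num_heads, batch_first=True)
        self.mlp = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim),
            nn.GELU(),
            nn.Linear(hidden_dim, hidden_dim),
        )

    def forward(self, hidden_states: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        # hidden_states: (batch, seq, dim); mask: (batch, seq), 1 = include in pooling
        batch = hidden_states.size(0)
        latents = self.latents.unsqueeze(0).expand(batch, -1, -1)
        # Token states are the queries; the latent array supplies keys and values
        attended, _ = self.attn(hidden_states, latents, latents)
        attended = attended + self.mlp(attended)
        # Masked mean over the selected token positions
        weights = mask.unsqueeze(-1).to(attended.dtype)
        summed = (attended * weights).sum(dim=1)
        return summed / weights.sum(dim=1).clamp(min=1.0)


# Random stand-in for encoder output: batch of 2, 5 tokens, hidden size 16
torch.manual_seed(0)
pool = LatentAttentionPooling(hidden_dim=16).eval()
hidden = torch.randn(2, 5, 16)
mask = torch.tensor([[0, 0, 1, 1, 1], [1, 1, 1, 1, 1]])  # row 0 is left-padded
pooled = pool(hidden, mask)  # one vector per input row
```

In this sketch the mask decides which token positions contribute to the final vector, so padded positions have no effect on the result.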

Key Features

Base Architecture: LLM2CLIP-Llama-3.2-1B-Instruct
Pooling Mode: Latent Attention (modified from original)
Bidirectional Processing: Enabled for better context understanding
Medical Domain: Specialized for chest X-ray report analysis
Max Length: 512 tokens
Precision: bfloat16

Training Details

Training Data

Fully fine-tuned on chest X-ray reports and medical text data
Training focused on understanding pleural effusion status and other chest X-ray findings

Training Configuration

Pooling Mode: latent_attention (modified from base model)
Enable Bidirectional: True
Max Length: 512
Torch Dtype: bfloat16
Full Fine-tuning: All model weights were updated during training

Usage

Installation

pip install torch transformers
# Also requires the LLM2Vec wrapper - see the original repository for installation

Basic Usage

import torch
import torch.nn.functional as F
from llm2vec_wrapper import LLM2VecWrapper as LLM2Vec

# Load the model
model = LLM2Vec.from_pretrained(
    base_model_name_or_path='lukeingawesome/llm2vec4cxr',
    enable_bidirectional=True,
    pooling_mode="latent_attention",
    max_length=512,
    torch_dtype=torch.bfloat16,
)

# Configure tokenizer
tokenizer = model.tokenizer
tokenizer.padding_side = 'left'

# Example usage for chest X-ray report analysis
def encode_text(text):
    inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True, max_length=512)
    with torch.no_grad():
        embeddings = model(inputs)
    return embeddings

# Example with medical text
report = "There is a small increase in the left-sided effusion. There continues to be volume loss at both bases."
embedding = encode_text(report)
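
Once two texts are embedded, they can be compared with cosine similarity. The sketch below uses small stand-in tensors rather than real model output, and `embedding_similarity` is a hypothetical helper name, not part of the wrapper. It casts to float32 first, on the assumption that the embeddings come back in bfloat16 as configured above.

```python
import torch
import torch.nn.functional as F


def embedding_similarity(emb_a: torch.Tensor, emb_b: torch.Tensor) -> torch.Tensor:
    """Row-wise cosine similarity between two batches of embeddings."""
    # Cast to float32 so the comparison is not limited by bfloat16 precision
    return F.cosine_similarity(emb_a.float(), emb_b.float(), dim=-1)


# Stand-in embeddings; in practice these would come from encode_text(...)
emb_a = torch.tensor([[1.0, 0.0, 1.0]], dtype=torch.bfloat16)
emb_b = torch.tensor([[1.0, 0.0, 0.0]], dtype=torch.bfloat16)
score = embedding_similarity(emb_a, emb_b)  # shape (1,)
```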

Advanced Usage with Separator-based Processing

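The model supports separator-based processing for instruction-following tasks. The instruction and the report are joined with a separator string, and the helper below builds an embed_mask that marks the tokens after the separator (the report):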

def tokenize_with_separator(texts, tokenizer, max_length):
    """Tokenize texts of the form <instruction><separator><report>.

    Returns the tokenizer output plus an "embed_mask" that is 1 on the tokens
    of the part after the separator. Relies on left padding, so that part
    occupies the last positions of each row.
    """
    separator = '!@#$%^&*()'
    texts_2 = []         # part after the separator (empty if there is none)
    original_texts = []  # full text with the separator removed

    for text in texts:
        parts = text.split(separator)
        texts_2.append(parts[1] if len(parts) > 1 else "")
        original_texts.append("".join(parts))

    tokenized = tokenizer(
        original_texts,
        return_tensors="pt",
        padding=True,
        truncation=True,
        max_length=max_length,
    )

    # Build one mask row per text, marking the trailing tokens of the second part
    masks = []
    for t_i, t in enumerate(texts_2):
        ids = tokenizer([t], return_tensors="pt", padding=True, truncation=True,
                        max_length=max_length, add_special_tokens=False)
        n_tokens = len(ids["input_ids"][0])
        e_m = torch.zeros_like(tokenized["attention_mask"][t_i])
        if n_tokens > 0:
            e_m[-n_tokens:] = 1
        masks.append(e_m)

    tokenized["embed_mask"] = torch.stack(masks)
    return tokenized

# Example with instruction and report
separator = '!@#$%^&*()'
instruction = 'Determine the change or the status of the pleural effusion.'
report = 'There is a small increase in the left-sided effusion.'
text = instruction + separator + report

tokenized = tokenize_with_separator([text], tokenizer, 512)
with torch.no_grad():  # inference only, as in the basic example
    embedding = model(tokenized)
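
To see what the resulting embed_mask looks like, and why left padding matters, the mask construction can be reproduced on a toy batch. This is a sketch with made-up token counts: `trailing_embed_mask` is a hypothetical helper, and `tail_lengths` stands in for the number of report tokens that the function above obtains by tokenizing the second part separately.

```python
import torch


def trailing_embed_mask(attention_mask: torch.Tensor, tail_lengths) -> torch.Tensor:
    """Mark the last tail_lengths[i] positions of row i with 1, the rest with 0."""
    seq_len = attention_mask.size(1)
    positions = torch.arange(seq_len).unsqueeze(0)                # (1, seq_len)
    start = seq_len - torch.as_tensor(tail_lengths).unsqueeze(1)  # (batch, 1)
    return (positions >= start).to(attention_mask.dtype)


# Hypothetical left-padded batch: row 0 has two pad tokens, row 1 has none
attention_mask = torch.tensor([[0, 0, 1, 1, 1, 1],
                               [1, 1, 1, 1, 1, 1]])
embed_mask = trailing_embed_mask(attention_mask, [2, 3])
# embed_mask is [[0, 0, 0, 0, 1, 1],
#                [0, 0, 0, 1, 1, 1]]
```

With left padding, the trailing positions always hold real report tokens. With right padding, the same slice would land on pad tokens for shorter rows, which is why the tokenizer is configured with padding_side = 'left'.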

Evaluation

The model has been evaluated on chest X-ray report analysis tasks, particularly for:

Pleural effusion status determination
Medical text similarity comparison
Clinical finding extraction

Sample Performance

Compared to the base model, this model shows improved performance on medical text understanding tasks, particularly in distinguishing between different pleural effusion states and in handling medical abbreviations.

Intended Use

Primary Use Cases

Medical Text Embeddings: Generate embeddings for chest X-ray reports
Clinical Text Similarity: Compare medical texts for semantic similarity
Medical Information Retrieval: Find relevant medical reports or findings
Clinical NLP Research: Foundation model for medical text analysis
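
For the retrieval use case, one simple approach is to embed a corpus of reports once and rank them against a query embedding. The sketch below assumes the embeddings are already computed and uses small stand-in tensors. `top_k_reports` is a hypothetical helper, and brute-force cosine ranking is only one of several possible retrieval strategies.

```python
import torch
import torch.nn.functional as F


def top_k_reports(query_emb: torch.Tensor, corpus_embs: torch.Tensor, k: int = 3):
    """Return (corpus index, cosine score) pairs for the k best matches."""
    q = F.normalize(query_emb.float().reshape(1, -1), dim=-1)  # (1, dim)
    c = F.normalize(corpus_embs.float(), dim=-1)               # (n, dim)
    scores = (c @ q.T).squeeze(1)                              # (n,)
    top = torch.topk(scores, min(k, scores.numel()))
    return list(zip(top.indices.tolist(), top.values.tolist()))


# Stand-in embeddings; real ones would come from the model
corpus_embs = torch.tensor([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]])
query_emb = torch.tensor([1.0, 0.1])
hits = top_k_reports(query_emb, corpus_embs, k=2)
```

Each entry in `hits` pairs a row index in the corpus with its similarity score, best match first.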

Limitations

Specialized for chest X-ray reports - may not generalize to other medical domains
Requires careful preprocessing for optimal performance
Should be used as part of a larger clinical decision support system, not for standalone diagnosis

Technical Specifications

Model Type: Bidirectional Language Model (LLM2Vec)
Architecture: LlamaBiModel (modified Llama 3.2)
Parameters: ~1B parameters
Input Length: Up to 512 tokens
Output: Dense embeddings
Precision: bfloat16

Citation

If you use this model in your research, please cite:

@misc{llm2vec4cxr,
  title={LLM2Vec4CXR: Fine-tuned Language Model for Chest X-ray Report Analysis},
  author={[Your Name]},
  year={2024},
  howpublished={\url{https://huggingface.co/lukeingawesome/llm2vec4cxr}},
}

Acknowledgments

This model is built upon:

LLM2Vec - Framework for converting decoder-only LLMs into text encoders
LLM2CLIP - Microsoft's implementation for connecting LLMs with CLIP models
microsoft/LLM2CLIP-Llama-3.2-1B-Instruct-CC-Finetuned - Base model

License

This model is licensed under the MIT License.