GemmaECG-Vision

GemmaECG-Vision is a fine-tuned vision-language model built on google/gemma-3n-e2b, designed for ECG image interpretation tasks. The model accepts a medical ECG image along with a clinical instruction prompt and generates a structured analysis suitable for triage or documentation use cases.

This model was developed using Unsloth for efficient fine-tuning and supports image + text inputs with medical task-specific prompt formatting. It is designed to run in offline or edge environments, enabling healthcare triage in resource-constrained settings.

Model Objective

To assist healthcare professionals and emergency responders by providing AI-generated ECG analysis directly from medical images, without requiring internet access or cloud resources.

Usage

This model expects:

  • An ECG image (PIL.Image)
  • A textual instruction such as:

You are a clinical assistant specialized in ECG interpretation. Given an ECG image, generate a concise, structured, and medically accurate report.

Use this exact format:

Rhythm:
PR Interval:
QRS Duration:
Axis:
Bundle Branch Blocks:
Atrial Abnormalities:
Ventricular Hypertrophy:
Q Wave or QS Complexes:
T Wave Abnormalities:
ST Segment Changes:
Final Impression:
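
Because the report format is fixed, the instruction can be built from a list of field names and the model's reply can be split back into those fields. The following is a minimal sketch, not part of the released model: `REPORT_FIELDS`, `INSTRUCTION`, and `parse_report` are hypothetical names introduced for illustration, and the parser assumes the model emits each label at the start of a line.

```python
# Field names copied from the report format above.
REPORT_FIELDS = [
    "Rhythm",
    "PR Interval",
    "QRS Duration",
    "Axis",
    "Bundle Branch Blocks",
    "Atrial Abnormalities",
    "Ventricular Hypertrophy",
    "Q Wave or QS Complexes",
    "T Wave Abnormalities",
    "ST Segment Changes",
    "Final Impression",
]

# Instruction text assembled from the prompt shown above.
INSTRUCTION = (
    "You are a clinical assistant specialized in ECG interpretation. "
    "Given an ECG image, generate a concise, structured, and medically "
    "accurate report.\n\nUse this exact format:\n\n"
    + "\n".join(f"{name}:" for name in REPORT_FIELDS)
)


def parse_report(text):
    """Hypothetical helper: map each known field to the text after its label.

    Lines that do not start with a known label are appended to the
    previous field; fields the model omitted map to None.
    """
    report = {name: None for name in REPORT_FIELDS}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        label, sep, value = stripped.partition(":")
        if sep and label.strip() in report:
            current = label.strip()
            report[current] = value.strip()
        elif current is not None:
            report[current] = (report[current] + " " + stripped).strip()
    return report
```

Fields that come back as `None` indicate that the model deviated from the requested format, which may be worth flagging before the report reaches a clinician.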

Inference Example (Python)

from transformers import AutoProcessor, Gemma3nForConditionalGeneration
from PIL import Image
import torch

model_id = "yasserrmd/GemmaECG-Vision"
model = Gemma3nForConditionalGeneration.from_pretrained(model_id, torch_dtype=torch.bfloat16).eval().to("cuda")
processor = AutoProcessor.from_pretrained(model_id)

image = Image.open("example_ecg.png").convert("RGB")

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "Interpret this ECG and provide a structured triage report."}
        ]
    }
]

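# Render the chat messages into a prompt string that contains the image placeholder
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)

# Pass images and text by keyword, then cast floating-point inputs
# (the pixel values) to the dtype the model was loaded in
inputs = processor(images=image, text=prompt, return_tensors="pt").to("cuda", dtype=torch.bfloat16)

outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,  # temperature, top_p, and top_k only apply when sampling
    temperature=1.0,
    top_p=0.95,
    top_k=64,
    use_cache=True
)

# Decode only the newly generated tokens, not the echoed prompt
prompt_len = inputs["input_ids"].shape[-1]
result = processor.decode(outputs[0][prompt_len:], skip_special_tokens=True)
print(result)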
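
Since the model is meant for offline use, the loading step above can be restricted to files already in the local cache. The sketch below assumes the weights were downloaded once while a connection was available; `load_offline` is a hypothetical helper name. `HF_HUB_OFFLINE` and `local_files_only` are existing Hugging Face options, but whether they cover every file your deployment needs should be checked on the target device.

```python
import os

# Set before transformers / huggingface_hub are imported so the Hub
# client does not attempt network requests.
os.environ["HF_HUB_OFFLINE"] = "1"

MODEL_ID = "yasserrmd/GemmaECG-Vision"


def load_offline(model_id=MODEL_ID, device="cuda"):
    """Load model and processor from the local cache only (hypothetical helper)."""
    import torch
    from transformers import AutoProcessor, Gemma3nForConditionalGeneration

    model = Gemma3nForConditionalGeneration.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,
        local_files_only=True,  # raise an error instead of downloading
    ).eval().to(device)
    processor = AutoProcessor.from_pretrained(model_id, local_files_only=True)
    return model, processor
```

If the files are missing from the cache, loading fails with an error rather than silently reaching for the network, which is usually the preferable behavior on an edge device.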

Training Details

  • Framework: Unsloth + TRL SFTTrainer
  • Hardware: Google Colab Pro (L4)
  • Batch Size: 2
  • Epochs: 1
  • Learning Rate: 2e-4
  • Scheduler: Cosine
  • Loss: CrossEntropy
  • Precision: bfloat16
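
The listed settings can be written as keyword arguments for TRL's `SFTConfig`. This is a sketch of how they might map, not the author's actual training script: the key names follow `transformers.TrainingArguments` (which `SFTConfig` extends), `output_dir` is a hypothetical path, and `build_sft_config` is a hypothetical helper. Cross-entropy needs no explicit setting because it is the default loss for causal language modeling in Transformers.

```python
# Values taken from the list above; key names follow
# transformers.TrainingArguments, which trl.SFTConfig extends.
HPARAMS = {
    "output_dir": "outputs",           # hypothetical path, not from the source
    "per_device_train_batch_size": 2,  # Batch Size
    "num_train_epochs": 1,             # Epochs
    "learning_rate": 2e-4,             # Learning Rate
    "lr_scheduler_type": "cosine",     # Scheduler
    "bf16": True,                      # Precision
}


def build_sft_config(hparams=HPARAMS):
    """Create an SFTConfig; requires trl and bf16-capable hardware."""
    from trl import SFTConfig

    return SFTConfig(**hparams)
```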

Dataset

The training dataset is a curated subset of the PULSE-ECG/ECGInstruct dataset, reformatted for VLM instruction tuning.

  • 3,272 samples of ECG image + structured instruction + clinical output
  • Focused on realistic and medically relevant triage cases

Dataset link: yasserrmd/pulse-ecg-instruct-subset
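
The reformatting step itself is not shown in this card. As a rough sketch, a record could be converted into the user/assistant message layout commonly used for vision-language instruction tuning. The column names `image`, `instruction`, and `output` are assumptions, as is the helper name `to_conversation`; the actual dataset schema may differ.

```python
def to_conversation(sample):
    """Convert one record into a user/assistant message pair.

    Assumes hypothetical keys "image", "instruction", and "output";
    the real column names in the dataset may differ.
    """
    return {
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": sample["instruction"]},
                    {"type": "image", "image": sample["image"]},
                ],
            },
            {
                "role": "assistant",
                "content": [{"type": "text", "text": sample["output"]}],
            },
        ]
    }
```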

Training Loss Summary

The model was fine-tuned over 409 steps using the pulse-ecg-instruct-subset dataset. The training loss started above 9.5 and steadily declined to below 0.5, showing consistent convergence and learning throughout the single epoch. The loss curve demonstrates a stable optimization process without overfitting spikes. The chart below visualizes this progression, highlighting the model's ability to adapt quickly to the ECG image-to-text task.
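
The step count can be cross-checked against the dataset size and batch size. With a batch size of 2 alone, one epoch over 3,272 samples would take 1,636 steps, so 409 steps suggests an effective batch size of 8. The sketch below assumes 4 gradient accumulation steps on a single device; that value is not stated in this card and is inferred only from the arithmetic.

```python
import math

num_samples = 3272         # from the Dataset section
per_device_batch_size = 2  # from Training Details
grad_accum_steps = 4       # ASSUMPTION: not stated in the source
epochs = 1                 # from Training Details

effective_batch = per_device_batch_size * grad_accum_steps
steps_per_epoch = math.ceil(num_samples / effective_batch)
total_steps = steps_per_epoch * epochs

print(effective_batch, total_steps)  # prints: 8 409
```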

Intended Use

  • Emergency triage in offline settings
  • On-device ECG assessment
  • Integration with medical edge devices (Jetson, Pi, Android)
  • Rapid analysis during disaster response

Limitations

  • Not intended to replace licensed medical professionals
  • Accuracy may vary depending on image quality
  • Model outputs should be reviewed by a clinician before action

License

This model is licensed under CC BY 4.0. You are free to use, modify, and distribute it with attribution.

Author

Mohamed Yasser (Hugging Face profile: yasserrmd)
