Model Details

Model Description

This model is a fine-tuned version of cambridgeltl/SapBERT-from-PubMedBERT-fulltext on the DDXPlus dataset (10,000 samples) for medical diagnosis tasks.

This is the model card of a 🤗 Transformers model that has been pushed to the Hub. This model card was automatically generated.

  • Developed by: Aashish Acharya
  • Model type: SapBERT-PubMedBERT
  • Language(s): English
  • License: MIT
  • Finetuned from model: cambridgeltl/SapBERT-from-PubMedBERT-fulltext

Training Dataset

The model was trained on the DDXPlus dataset (10,000 samples), containing (an example record is sketched after this list):

  • Patient cases with comprehensive medical information
  • Differential diagnosis annotations
  • 49 distinct medical conditions
  • Evidence-based symptom-condition relationships
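
For orientation, each DDXPlus patient case is a structured record rather than free text. The sketch below shows one such record; the field names follow the public DDXPlus release, but the specific values and conditions are illustrative, not drawn from this model's training split.

# Illustrative DDXPlus-style patient record. Field names follow the public
# DDXPlus release; the values below are made up for demonstration.
patient_case = {
    "AGE": 45,
    "SEX": "M",
    "INITIAL_EVIDENCE": "E_91",             # first reported evidence code
    "EVIDENCES": ["E_91", "E_77", "E_89"],  # all collected evidence codes
    "PATHOLOGY": "Bronchitis",              # ground-truth condition
    "DIFFERENTIAL_DIAGNOSIS": [             # ranked (condition, probability) pairs
        ["Bronchitis", 0.61],
        ["Pneumonia", 0.24],
    ],
}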

Performance

Final Metrics

  • Test Precision: 0.9619
  • Test Recall: 0.9610
  • Test F1 Score: 0.9592

Training Evolution

  • Best Validation F1: 0.9728 (Epoch 4)
  • Final Validation Loss: 0.6352

Intended Use

This model is designed for:

  • Medical diagnosis support
  • Symptom analysis
  • Disease classification
  • Differential diagnosis generation

Out-of-Scope Use

The model should NOT be used for:

  • Direct medical diagnosis without professional oversight
  • Critical healthcare decisions without human validation
  • Clinical applications without proper testing and validation

Training Details

Training Procedure

  • Optimizer: AdamW with weight decay (0.01)
  • Learning Rate: 1e-5
  • Loss Function: Combined loss (0.8 × Focal Loss + 0.2 × KL Divergence); a sketch follows this list
  • Batch Size: 32
  • Gradient Clipping: 1.0
  • Early Stopping: Patience of 3 epochs
  • Training Strategy: Cross-validation with 5 folds
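
A minimal PyTorch sketch of the combined objective is below. The focal-loss gamma and the exact form of the KL term are assumptions (the card states only the 0.8/0.2 weighting); the soft targets are assumed to be a probability distribution over the 49 conditions, e.g. a DDXPlus differential.

import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0):
    # Multi-class focal loss: down-weights well-classified examples.
    # gamma=2.0 is an assumed default; the card does not specify it.
    log_probs = F.log_softmax(logits, dim=-1)
    ce = F.nll_loss(log_probs, targets, reduction="none")
    p_t = log_probs.exp().gather(1, targets.unsqueeze(1)).squeeze(1)
    return ((1.0 - p_t) ** gamma * ce).mean()

def combined_loss(logits, targets, soft_targets):
    # 0.8 x focal loss + 0.2 x KL divergence, per the weighting above.
    # soft_targets: assumed probability distribution over the 49 conditions.
    kl = F.kl_div(F.log_softmax(logits, dim=-1), soft_targets,
                  reduction="batchmean")
    return 0.8 * focal_loss(logits, targets) + 0.2 * kl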

Model Architecture

  • Base Model: cambridgeltl/SapBERT-from-PubMedBERT-fulltext
  • Hidden Size: 768
  • Attention Heads: 12
  • Dropout Rate: 0.5
  • Added classification layers for diagnostic tasks (a reconstruction is sketched after this list)
  • Layer normalization and dropout for regularization
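
The card does not include the head definition; a minimal reconstruction consistent with the bullets above (a linear layer over the [CLS] representation, with layer normalization and dropout 0.5) might look like the following. The head depth and pooling choice are assumptions.

import torch.nn as nn
from transformers import AutoModel

class DiagnosisClassifier(nn.Module):
    # Hypothetical reconstruction: SapBERT encoder + layer norm + dropout
    # + a linear head over the 49 DDXPlus conditions.
    def __init__(self, base="cambridgeltl/SapBERT-from-PubMedBERT-fulltext",
                 num_labels=49, dropout=0.5):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(base)
        self.norm = nn.LayerNorm(768)       # hidden size of the base model
        self.dropout = nn.Dropout(dropout)  # 0.5, as stated above
        self.classifier = nn.Linear(768, num_labels)

    def forward(self, input_ids, attention_mask=None):
        hidden = self.encoder(input_ids,
                              attention_mask=attention_mask).last_hidden_state
        cls = hidden[:, 0]                  # [CLS] token representation
        return self.classifier(self.dropout(self.norm(cls)))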

Example Usage

from transformers import AutoTokenizer, AutoModel
import torch

# Load model and tokenizer
model_name = "acharya-jyu/sapbert-pubmedbert-ddxplus-10k"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

# Example input structure (a DDXPlus-style patient case)
input_data = {
    'age': 45,                   # patient age
    'sex': 'M',                  # patient sex: 'M' or 'F'
    'initial_evidence': 'E_91',  # initial evidence code (e.g., E_91 for fever)
    'evidences': [
        'E_91',  # fever
        'E_77',  # cough
        'E_89',  # fatigue
    ]
}

# The encoder consumes token IDs, not a raw Python dict, so the structured
# case must first be serialized into a single string of evidence codes.
# The exact serialization used during fine-tuning is not documented; the
# format below is one plausible choice.
text = (f"age: {input_data['age']} sex: {input_data['sex']} "
        f"evidences: {' '.join(input_data['evidences'])}")
inputs = tokenizer(text, return_tensors="pt", truncation=True)

with torch.no_grad():
    outputs = model(**inputs)

# outputs.last_hidden_state holds the contextual token embeddings. The
# diagnostic head trained on top of the encoder (main diagnosis,
# differential-diagnosis probabilities, confidence scores) operates on
# these representations; AutoModel alone loads only the encoder.

Note: Evidence codes (E_XX) correspond to specific symptoms and conditions defined in the release_evidences.json file. The model expects these standardized codes rather than raw text input.
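
A minimal sketch of resolving codes to readable symptom text, assuming release_evidences.json is keyed by the same E_XX codes used above and that each entry carries an English question under 'question_en' (field names vary across DDXPlus releases):

import json

# Load the evidence metadata shipped with the DDXPlus release.
with open("release_evidences.json", encoding="utf-8") as f:
    evidences = json.load(f)

# Resolve each code to its English question text; 'question_en' is an
# assumed field name and may differ in other versions of the file.
for code in ["E_91", "E_77", "E_89"]:
    entry = evidences.get(code, {})
    print(code, "->", entry.get("question_en", "<unknown code>"))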

Citation

@misc{acharya2024sapbert,
  title={SapBERT-PubMedBERT Fine-tuned on DDXPlus Dataset},
  author={Acharya, Aashish},
  year={2024},
  publisher={Hugging Face Model Hub}
}

Model Card Contact

Aashish Acharya
