BioBERT Symptom Text Classifier 🧬🩺

This model is a fine-tuned version of dmis-lab/biobert-base-cased-v1.1 on a symptom-to-condition classification task. It maps free-form English descriptions of medical symptoms to one of 25 predefined symptom categories, such as "back pain", "head ache", and "injury from sports".

🧠 Model Details

  • Architecture: BioBERT (Transformer-based)
  • Base Model: dmis-lab/biobert-base-cased-v1.1
  • Task: Text Classification (Single-label)
  • Labels: 25 symptom categories (see full list below)
  • Language: English
  • License: Apache 2.0
  • Model size: 108M parameters (F32, Safetensors)

📊 Datasets Used

This model was trained on a combination of public datasets containing free-text symptom descriptions annotated with the associated pain type or complaint.

🏷️ Label Set (25 Classes)

The model predicts one of the following 25 labels:

ID  Symptom Category
0   emotional pain
1   hair falling out
2   heart hurts
3   infected wound
4   foot ache
5   shoulder pain
6   injury from sports
7   skin issue
8   stomach ache
9   knee pain
10  joint pain
11  hard to breath
12  head ache
13  body feels weak
14  feeling dizzy
15  back pain
16  open wound
17  internal pain
18  blurry vision
19  acne
20  muscle pain
21  neck pain
22  cough
23  ear ache
24  feeling cold
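
The label table can also be written out as a plain Python mapping, which is handy for checking predictions or building a reverse lookup. This is a minimal sketch that assumes the checkpoint's `id2label` config matches the table exactly (not verified here); `LABELS`, `ID2LABEL`, `LABEL2ID`, and `label_for` are illustrative names, not part of the model.

```python
# Label names transcribed verbatim from the table, spellings included.
# Assumption: the checkpoint's config.id2label matches this list.
LABELS = [
    "emotional pain", "hair falling out", "heart hurts", "infected wound",
    "foot ache", "shoulder pain", "injury from sports", "skin issue",
    "stomach ache", "knee pain", "joint pain", "hard to breath",
    "head ache", "body feels weak", "feeling dizzy", "back pain",
    "open wound", "internal pain", "blurry vision", "acne",
    "muscle pain", "neck pain", "cough", "ear ache", "feeling cold",
]

# Forward and reverse lookups between class IDs and label names.
ID2LABEL = dict(enumerate(LABELS))
LABEL2ID = {label: idx for idx, label in ID2LABEL.items()}


def label_for(class_id: int) -> str:
    """Return the symptom category for a class ID; raise on unknown IDs."""
    if class_id not in ID2LABEL:
        raise ValueError(f"class_id must be in 0..{len(LABELS) - 1}, got {class_id}")
    return ID2LABEL[class_id]
```

Label names should be compared exactly as spelled in the table (for example "head ache" and "hard to breath"), since those are the strings the model returns.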

🚀 Usage

To use the model in your project:

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_name = "your-username/your-model-name"  # Replace with actual path

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

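def classify_symptom(text):
    """Return the predicted symptom category for a free-text description."""
    # Tokenize the input; truncation keeps long texts within the model's limit.
    inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)
    # Inference only, so gradient tracking is disabled.
    with torch.no_grad():
        outputs = model(**inputs)
    # Pick the highest-scoring class and map its ID to the label name.
    predicted_class_id = torch.argmax(outputs.logits, dim=-1).item()
    return model.config.id2label[predicted_class_id]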

# Example
classify_symptom("My lower back hurts when I sit for a long time")
# ➜ "back pain"
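
`classify_symptom` returns only the single best label. When a ranked list with scores is more useful, the logits can be turned into probabilities with a softmax and sorted. The sketch below does this in plain Python on a list of logits, such as `outputs.logits[0].tolist()`; `softmax` and `top_k_labels` are hypothetical helper names, and the toy logits and three-class mapping are made up for illustration.

```python
import math


def softmax(logits):
    """Convert raw logits to probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_labels(logits, id2label, k=3):
    """Return the k most probable (label, probability) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    return [(id2label[idx], prob) for idx, prob in ranked[:k]]


# Made-up logits for a toy 3-class mapping, for illustration only.
toy_id2label = {0: "back pain", 1: "neck pain", 2: "muscle pain"}
toy_logits = [2.0, 0.5, 1.0]
ranking = top_k_labels(toy_logits, toy_id2label, k=2)
for label, prob in ranking:
    print(f"{label}: {prob:.3f}")
```

With the real model, `id2label` would be `model.config.id2label` and the logits list would have 25 entries, one per symptom category.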