BDHPD: Bilingual Dual-Head Architecture for Parkinson's Disease Detection from Speech
This model implements BDHPD, a deep neural network designed to detect Parkinson's Disease (PD) from speech signals, with bilingual support for Slovak and Spanish datasets.
Model Description
BDHPD combines several modern audio processing techniques:
- Self-supervised learning (SSL) with models like microsoft/wavlm-base
- Wavelet-based spectrogram features
- Adaptive Instance Normalization (AdaIN) for domain adaptation
- Convolutional Bottleneck Layers for feature recalibration
- Dual-head classification architecture to handle different speech types (e.g., diadochokinetic and continuous)
- Contrastive learning for embedding space refinement
- Attention pooling for better sequence summarization
The architecture supports bilingual inputs and has been evaluated on EWA-DB (Slovak) and PC-GITA (Spanish).
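To illustrate the attention pooling step listed above, here is a minimal, dependency-free sketch. It is not the BDHPD implementation: the function name `attention_pool` and the single scoring vector `scorer` are assumptions made for illustration, and a real model would learn the scorer and operate on tensors. The idea is that each frame receives a scalar score, the scores are normalized with a softmax, and the frames are summed with those weights to give one utterance-level vector.

```python
import math


def attention_pool(frames, scorer):
    """Pool a sequence of frame vectors into one vector (illustrative only).

    frames: list of T feature vectors, each a list of D floats.
    scorer: list of D floats standing in for a learned scoring vector
            (hypothetical; not taken from the BDHPD code).
    Returns (pooled_vector, attention_weights).
    """
    # One scalar relevance score per frame: dot product with the scorer.
    scores = [sum(w * x for w, x in zip(scorer, frame)) for frame in frames]

    # Numerically stable softmax over the time axis.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum of frames gives the utterance-level summary.
    dim = len(frames[0])
    pooled = [
        sum(a * frame[d] for a, frame in zip(weights, frames))
        for d in range(dim)
    ]
    return pooled, weights


# With an all-zero scorer every frame gets equal weight,
# so the result reduces to plain mean pooling.
pooled, weights = attention_pool([[1.0, 2.0], [3.0, 4.0]], [0.0, 0.0])
```

A non-zero scorer shifts weight toward the frames it scores highly, which is what lets the pooled vector emphasize informative segments over silence or noise.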
Intended Use
- Research in pathological speech detection
- Benchmarking bilingual speech-based PD detection models
- Development of real-world diagnostic support tools in healthcare
Training
Training was performed using:
- AdamW optimizer
- Linear learning rate scheduling with warmup
- Binary cross-entropy loss for classification
- Contrastive loss via pytorch-metric-learning
- 20 epochs with early stopping
- Balanced batch sampling for both datasets
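As a rough illustration of linear learning rate scheduling with warmup, the sketch below computes the learning rate at a given optimizer step. It assumes the common variant in which the rate rises linearly from zero to a peak and then decays linearly back to zero; whether BDHPD decays to zero is not stated here. The function name and all numeric values (peak rate, warmup length, total steps) are hypothetical.

```python
def linear_warmup_lr(step, base_lr, warmup_steps, total_steps):
    """Learning rate at `step` for linear warmup followed by linear decay.

    All arguments are illustrative; none of the values come from the
    BDHPD training configuration.
    """
    if step < warmup_steps:
        # Warmup phase: ramp from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: ramp from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Hypothetical schedule: peak 1e-4, 100 warmup steps, 1000 steps in total.
schedule = [linear_warmup_lr(s, 1e-4, 100, 1000) for s in (0, 50, 100, 550, 1000)]
```

The rate peaks exactly at the end of warmup and reaches zero at the final step, so early updates to the pretrained SSL encoder stay small.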
How to Use
Usage instructions and code are available in the BDHPD GitHub repository.
Datasets
Limitations
- The model is trained only on Slovak and Spanish speakers; generalization to other languages is untested.
- Performance is sensitive to audio quality; preprocess audio with voice activity detection (VAD) and dereverberation.
- Should not be used as a standalone diagnostic tool.
Citation
If you use this model or find this research useful, please cite the following paper:
@inproceedings{laquatra2025bilingual,
title={Bilingual Dual-Head Deep Model for Parkinson's Disease Detection from Speech},
author={La Quatra, Moreno and Orozco-Arroyave, Juan Rafael and Siniscalchi, Marco Sabato},
booktitle={ICASSP 2025 - IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
year={2025}
}
Evaluation results
All values are self-reported.
- F1 on EWA-DB (Slovak): 69.03
- Accuracy on EWA-DB (Slovak): 84.72
- Sensitivity on EWA-DB (Slovak): 56.52
- Specificity on EWA-DB (Slovak): 88.56
- F1 on PC-GITA (Spanish): 90.83
- Accuracy on PC-GITA (Spanish): 90.83
- Sensitivity on PC-GITA (Spanish): 93.33
- Specificity on PC-GITA (Spanish): 88.33
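For readers less familiar with these metrics, the sketch below shows how sensitivity, specificity, accuracy, and F1 are typically derived from confusion-matrix counts, treating PD as the positive class. The counts are hypothetical and are not the results reported above. The F1 shown is the positive-class F1; which averaging the reported F1 uses is not stated here, so that choice is an assumption.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts.

    Assumes every denominator is non-zero (at least one positive, one
    negative, and one positive prediction).
    """
    sensitivity = tp / (tp + fn)   # recall on PD speakers
    specificity = tn / (tn + fp)   # recall on healthy controls
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    # Positive-class F1: harmonic mean of precision and sensitivity.
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical counts for illustration only.
metrics = binary_metrics(tp=40, fp=5, tn=45, fn=10)
```

Reporting sensitivity and specificity alongside accuracy matters on imbalanced data, where a high accuracy can coexist with a low detection rate for the PD class.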