Overview

This repository provides a pre-trained model from DiariZen. Its EEND component is built on WavLM Base+ with Conformer layers. The model was trained on far-field, single-channel audio from a diverse set of public datasets, including AMI, AISHELL-4, AliMeeting, NOTSOFAR-1, MSDWild, DIHARD3, RAMC, and VoxConverse.

Structured pruning at 80% sparsity is then applied to the WavLM Base+ encoder. After pruning, its parameter count drops from 94.4M to 18.8M, and its computational cost falls from 6.9G to 1.1G MACs per second of audio.
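
As a rough sanity check on those figures, 80% structured sparsity means about one fifth of the encoder parameters survive; the sketch below only restates that arithmetic (the exact count depends on which modules are pruned, so treat it as an approximation):

# back-of-the-envelope check of the reported compression:
# 80% structured sparsity keeps roughly 20% of WavLM Base+ parameters
full_params_m = 94.4                    # WavLM Base+ parameters, in millions (from above)
kept = full_params_m * (1.0 - 0.80)     # fraction of parameters remaining
print(f"expected pruned size: ~{kept:.1f}M (reported: 18.8M)")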

Usage

from diarizen.pipelines.inference import DiariZenPipeline

# load pre-trained model
diar_pipeline = DiariZenPipeline.from_pretrained("BUT-FIT/diarizen-wavlm-base-s80-md")
# apply diarization pipeline
diar_results = diar_pipeline('audio.wav')

# print results
for turn, _, speaker in diar_results.itertracks(yield_label=True):
    print(f"start={turn.start:.1f}s stop={turn.end:.1f}s speaker_{speaker}")

# load pre-trained model and save RTTM result
diar_pipeline = DiariZenPipeline.from_pretrained(
    "BUT-FIT/diarizen-wavlm-base-s80-md",
    rttm_out_dir='.'
)
# apply diarization pipeline
diar_results = diar_pipeline('audio.wav', sess_name='session_name')
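
The pipeline's output supports the pyannote.core.Annotation accessors (the itertracks call above comes from that API). As a minimal sketch, assuming that interface, the result can be summarized as follows:

# summarize diar_results, assuming it is a pyannote.core.Annotation
speakers = diar_results.labels()                   # detected speaker labels
speech = diar_results.get_timeline().duration()    # total speech time (s), overlap merged
print(f"{len(speakers)} speakers, {speech:.1f}s of speech")
for spk in speakers:
    # per-speaker speaking time
    print(f"speaker_{spk}: {diar_results.label_duration(spk):.1f}s")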

Results (DER %, collar = 0 s)

Dataset        Pyannote v3.1    DiariZen
AMI            22.4             15.8
AISHELL-4      12.2             10.7
AliMeeting     24.4             14.1
NOTSOFAR-1     -                20.3
MSDWild        25.3             17.4
DIHARD3        21.7             15.9
RAMC           22.2             11.4
VoxConverse    11.3             9.7
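
For a quick aggregate reading of the table, the snippet below computes the macro-average DER and the relative reduction over the datasets reported for both systems (NOTSOFAR-1 is skipped, as the baseline has no reported value; this is a convenience for reading the table, not an official metric):

# macro-average DER over datasets reported for both systems
pairs = {  # dataset: (Pyannote v3.1, DiariZen), DER in %
    "AMI": (22.4, 15.8), "AISHELL-4": (12.2, 10.7), "AliMeeting": (24.4, 14.1),
    "MSDWild": (25.3, 17.4), "DIHARD3": (21.7, 15.9), "RAMC": (22.2, 11.4),
    "VoxConverse": (11.3, 9.7),
}
base = sum(p for p, _ in pairs.values()) / len(pairs)
ours = sum(d for _, d in pairs.values()) / len(pairs)
print(f"avg DER: {base:.1f} -> {ours:.1f} "
      f"({100 * (base - ours) / base:.0f}% relative reduction)")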

Citation

If you find this work helpful, please consider citing:

@inproceedings{han2025leveraging,
  title={Leveraging self-supervised learning for speaker diarization},
  author={Han, Jiangyu and Landini, Federico and Rohdin, Johan and Silnova, Anna and Diez, Mireia and Burget, Luk{\'a}{\v{s}}},
  booktitle={Proc. ICASSP},
  year={2025}
}

@article{han2025fine,
  title={Fine-tune Before Structured Pruning: Towards Compact and Accurate Self-Supervised Models for Speaker Diarization},
  author={Han, Jiangyu and Landini, Federico and Rohdin, Johan and Silnova, Anna and Diez, Mireia and {\v{C}}ernock{\'y}, Jan and Burget, Luk{\'a}{\v{s}}},
  journal={arXiv preprint arXiv:2505.24111},
  year={2025}
}