DiariZen
DiariZen is a speaker diarization toolkit driven by AudioZen and Pyannote 3.1.
This repository provides a pre-trained DiariZen model. Its EEND (end-to-end neural diarization) component is built on WavLM Large and Conformer layers. The model was trained on far-field, single-channel audio from a diverse set of public datasets: AMI, AISHELL-4, AliMeeting, NOTSOFAR-1, MSDWild, DIHARD3, RAMC, and VoxConverse.
Structured pruning at 80% sparsity is then applied to WavLM Large, which reduces its parameter count from 316.6M to 63.3M and its computational cost from 17.8G to 3.8G MACs per second.
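As a quick sanity check on the pruning figures, the short sketch below recomputes the relative savings from the numbers quoted above. The `reduction` helper and the variable names are illustrative only and are not part of DiariZen.

```python
# Figures quoted in the text above (WavLM Large before and after pruning).
params_before_m, params_after_m = 316.6, 63.3  # parameters, in millions
macs_before_g, macs_after_g = 17.8, 3.8        # MACs per second, in billions


def reduction(before: float, after: float) -> float:
    """Return the fraction removed, e.g. 0.8 for an 80% reduction."""
    return 1.0 - after / before


param_reduction = reduction(params_before_m, params_after_m)
mac_reduction = reduction(macs_before_g, macs_after_g)

print(f"parameters removed: {param_reduction:.1%}")  # 80.0%
print(f"MACs removed:       {mac_reduction:.1%}")    # 78.7%
print(f"compression factor: {params_before_m / params_after_m:.1f}x")  # 5.0x
```

The parameter count drops by the stated 80%, while the MAC count drops slightly less, by roughly 79%.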
from diarizen.pipelines.inference import DiariZenPipeline
# load pre-trained model
diar_pipeline = DiariZenPipeline.from_pretrained("BUT-FIT/diarizen-wavlm-large-s80-md")
# apply diarization pipeline
diar_results = diar_pipeline('audio.wav')
# print results
for turn, _, speaker in diar_results.itertracks(yield_label=True):
    print(f"start={turn.start:.1f}s stop={turn.end:.1f}s speaker_{speaker}")
# load pre-trained model and save RTTM result
diar_pipeline = DiariZenPipeline.from_pretrained(
    "BUT-FIT/diarizen-wavlm-large-s80-md",
    rttm_out_dir='.'
)
# apply diarization pipeline
diar_results = diar_pipeline('audio.wav', sess_name='session_name')
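Once the turns are available, they can be post-processed with plain Python. The sketch below is a minimal illustration, assuming the turns have been collected into `(start, end, speaker)` tuples from the `itertracks` loop. The helpers `speaking_time` and `to_rttm_lines` are hypothetical names, not DiariZen functions, and the RTTM layout follows the common 10-field `SPEAKER` record, which may differ in number formatting from what `rttm_out_dir` writes.

```python
from collections import defaultdict

# Hypothetical turns, standing in for tuples collected from
# diar_results.itertracks(yield_label=True) as (turn.start, turn.end, speaker).
turns = [(0.0, 2.5, "0"), (2.5, 4.0, "1"), (4.0, 7.0, "0")]


def speaking_time(turns):
    """Sum the duration of all turns per speaker label."""
    totals = defaultdict(float)
    for start, end, speaker in turns:
        totals[speaker] += end - start
    return dict(totals)


def to_rttm_lines(turns, sess_name):
    """Format turns as 10-field RTTM SPEAKER records (onset, then duration)."""
    return [
        f"SPEAKER {sess_name} 1 {start:.3f} {end - start:.3f} <NA> <NA> {speaker} <NA> <NA>"
        for start, end, speaker in turns
    ]


totals = speaking_time(turns)
rttm_lines = to_rttm_lines(turns, "session_name")

print(totals)
print("\n".join(rttm_lines))
```

Note that RTTM stores each segment as an onset and a duration rather than a start and an end time, which is why the second numeric field is `end - start`.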
Dataset | Pyannote v3.1 DER (%) | DiariZen DER (%) |
---|---|---|
AMI | 22.4 | 14.0 |
AISHELL-4 | 12.2 | 9.8 |
AliMeeting | 24.4 | 12.5 |
NOTSOFAR-1 | - | 17.9 |
MSDWild | 25.3 | 15.6 |
DIHARD3 | 21.7 | 14.5 |
RAMC | 22.2 | 11.0 |
VoxConverse | 11.3 | 9.2 |
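To make the comparison easier to read, the sketch below computes the relative reduction of DiariZen over Pyannote v3.1 for each dataset, assuming the table values are diarization error rates in percent (lower is better). The `relative_reduction` helper is illustrative and not part of either toolkit.

```python
# Values copied from the table above: (Pyannote v3.1, DiariZen).
# NOTSOFAR-1 is omitted because no Pyannote v3.1 figure is reported.
results = {
    "AMI": (22.4, 14.0),
    "AISHELL-4": (12.2, 9.8),
    "AliMeeting": (24.4, 12.5),
    "MSDWild": (25.3, 15.6),
    "DIHARD3": (21.7, 14.5),
    "RAMC": (22.2, 11.0),
    "VoxConverse": (11.3, 9.2),
}


def relative_reduction(baseline: float, system: float) -> float:
    """Fraction of the baseline error removed by the system."""
    return (baseline - system) / baseline


reductions = {
    name: relative_reduction(baseline, system)
    for name, (baseline, system) in results.items()
}

for name, value in reductions.items():
    print(f"{name:<12} {value:.1%}")
```

On these figures the relative reduction ranges from about 19% on VoxConverse to about 50% on RAMC.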
If you found this work helpful, please consider citing:
@inproceedings{han2025leveraging,
title={Leveraging self-supervised learning for speaker diarization},
author={Han, Jiangyu and Landini, Federico and Rohdin, Johan and Silnova, Anna and Diez, Mireia and Burget, Luk{\'a}{\v{s}}},
booktitle={Proc. ICASSP},
year={2025}
}
@article{han2025fine,
title={Fine-tune Before Structured Pruning: Towards Compact and Accurate Self-Supervised Models for Speaker Diarization},
author={Han, Jiangyu and Landini, Federico and Rohdin, Johan and Silnova, Anna and Diez, Mireia and {\v{C}}ernock{\'y}, Jan and Burget, Luk{\'a}{\v{s}}},
journal={arXiv preprint arXiv:2505.24111},
year={2025}
}