DiariZen
DiariZen is a speaker diarization toolkit driven by AudioZen and Pyannote 3.1.
This repository hosts a pre-trained DiariZen model. The EEND (end-to-end neural diarization) component is built on WavLM Base+ and Conformer layers. The model was trained on far-field, single-channel audio from a diverse set of public datasets, including AMI, AISHELL-4, AliMeeting, NOTSOFAR-1, MSDWild, DIHARD3, RAMC, and VoxConverse.
Structured pruning at 80% sparsity is then applied. After pruning, the number of parameters in WavLM Base+ drops from 94.4M to 18.8M, and the computational cost (MACs) decreases from 6.9G to 1.1G per second of audio.
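As a quick sanity check on the figures above, the short sketch below turns the before/after numbers into relative reductions. It uses only the values quoted in the text; the variable and function names are illustrative and not part of DiariZen.

```python
# Figures quoted in the text for WavLM Base+ before and after pruning.
params_before_m, params_after_m = 94.4, 18.8  # parameters, in millions
macs_before_g, macs_after_g = 6.9, 1.1        # MACs per second, in billions


def reduction(before: float, after: float) -> float:
    """Return the fraction removed, e.g. 0.8 for an 80% reduction."""
    return 1.0 - after / before


param_reduction = reduction(params_before_m, params_after_m)
macs_reduction = reduction(macs_before_g, macs_after_g)
macs_ratio = macs_before_g / macs_after_g

print(f"parameters removed: {param_reduction:.1%}")  # 80.1%
print(f"MACs removed:       {macs_reduction:.1%}")   # 84.1%
print(f"MACs ratio:         {macs_ratio:.1f}x")      # 6.3x
```

The parameter reduction lines up with the stated 80% sparsity, while the MACs fall slightly more than the parameter count does.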
```python
from diarizen.pipelines.inference import DiariZenPipeline

# load pre-trained model
diar_pipeline = DiariZenPipeline.from_pretrained("BUT-FIT/diarizen-wavlm-base-s80-md")

# apply diarization pipeline
diar_results = diar_pipeline('audio.wav')

# print results
for turn, _, speaker in diar_results.itertracks(yield_label=True):
    print(f"start={turn.start:.1f}s stop={turn.end:.1f}s speaker_{speaker}")

# load pre-trained model and save RTTM result
diar_pipeline = DiariZenPipeline.from_pretrained(
    "BUT-FIT/diarizen-wavlm-base-s80-md",
    rttm_out_dir='.'
)

# apply diarization pipeline
diar_results = diar_pipeline('audio.wav', sess_name='session_name')
```
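Once an RTTM file has been saved, it can be read back with plain Python. The sketch below assumes the file follows the standard RTTM layout (`SPEAKER <file> <channel> <onset> <duration> <NA> <NA> <speaker> <NA> <NA>`); the sample lines are made up for illustration, and the exact file name and speaker labels that DiariZen writes are not specified here.

```python
from collections import defaultdict

# Made-up RTTM content for illustration; real content comes from the saved file.
rttm_text = """\
SPEAKER session_name 1 0.00 2.50 <NA> <NA> 0 <NA> <NA>
SPEAKER session_name 1 2.10 1.40 <NA> <NA> 1 <NA> <NA>
SPEAKER session_name 1 4.00 3.00 <NA> <NA> 0 <NA> <NA>
"""


def parse_rttm(text):
    """Return a list of (start, end, speaker) tuples from RTTM text."""
    segments = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and any non-SPEAKER records.
        if len(fields) < 8 or fields[0] != "SPEAKER":
            continue
        start = float(fields[3])
        duration = float(fields[4])
        segments.append((start, start + duration, fields[7]))
    return segments


segments = parse_rttm(rttm_text)

# Sum the speaking time per speaker label.
speaking_time = defaultdict(float)
for start, end, speaker in segments:
    speaking_time[speaker] += end - start

for speaker, seconds in sorted(speaking_time.items()):
    print(f"speaker_{speaker}: {seconds:.1f}s")
```

To use it on a real result, replace `rttm_text` with the contents of the file written to `rttm_out_dir`.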
Dataset | Pyannote v3.1 DER (%) | DiariZen DER (%) |
---|---|---|
AMI | 22.4 | 15.8 |
AISHELL-4 | 12.2 | 10.7 |
AliMeeting | 24.4 | 14.1 |
NOTSOFAR-1 | - | 20.3 |
MSDWild | 25.3 | 17.4 |
DIHARD3 | 21.7 | 15.9 |
RAMC | 22.2 | 11.4 |
VoxConverse | 11.3 | 9.7 |
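To make the comparison easier to read, the sketch below computes the relative improvement per dataset from the table values, assuming they are diarization error rates in percent (lower is better). NOTSOFAR-1 is skipped because no Pyannote v3.1 figure is given. The dictionary and variable names are illustrative only.

```python
# (Pyannote v3.1, DiariZen) scores copied from the table; None marks a missing value.
results = {
    "AMI": (22.4, 15.8),
    "AISHELL-4": (12.2, 10.7),
    "AliMeeting": (24.4, 14.1),
    "NOTSOFAR-1": (None, 20.3),
    "MSDWild": (25.3, 17.4),
    "DIHARD3": (21.7, 15.9),
    "RAMC": (22.2, 11.4),
    "VoxConverse": (11.3, 9.7),
}

# Keep only datasets where both systems have a score.
paired = {name: pair for name, pair in results.items() if pair[0] is not None}

relative_reduction = {
    name: (baseline - diarizen) / baseline
    for name, (baseline, diarizen) in paired.items()
}

for name, value in relative_reduction.items():
    print(f"{name:12s} {value:.1%} relative reduction")

# Unweighted (macro) averages over the paired datasets.
baseline_avg = sum(b for b, _ in paired.values()) / len(paired)
diarizen_avg = sum(d for _, d in paired.values()) / len(paired)
print(f"macro average: {baseline_avg:.1f} -> {diarizen_avg:.1f}")
```

The macro average weights every dataset equally regardless of its size, so it is only a rough summary.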
If you find this work helpful, please consider citing:
```bibtex
@inproceedings{han2025leveraging,
  title={Leveraging self-supervised learning for speaker diarization},
  author={Han, Jiangyu and Landini, Federico and Rohdin, Johan and Silnova, Anna and Diez, Mireia and Burget, Luk{\'a}{\v{s}}},
  booktitle={Proc. ICASSP},
  year={2025}
}

@article{han2025fine,
  title={Fine-tune Before Structured Pruning: Towards Compact and Accurate Self-Supervised Models for Speaker Diarization},
  author={Han, Jiangyu and Landini, Federico and Rohdin, Johan and Silnova, Anna and Diez, Mireia and {\v{C}}ernock{\'y}, Jan and Burget, Luk{\'a}{\v{s}}},
  journal={arXiv preprint arXiv:2505.24111},
  year={2025}
}
```