# Model Card for SHEET Models
This model card describes the models implemented in the SHEET toolkit trained using the training sets in MOS-Bench and benchmarked using the test sets in MOS-Bench.
The task is subjective speech quality assessment (SSQA), which aims to predict the perceptual quality score of speech.
## Model Details
- Developed by: Wen-Chin Huang
- Model type: SSL-MOS or AlignNet
- License: MIT
- Repository: SHEET
- Paper: SHEET (Interspeech 2025); [MOS-Bench](https://arxiv.org/abs/2411.03715) (arXiv; 2024)
- Demo: https://huggingface.co/spaces/unilight/sheet-demo
## Uses
Please refer to the README in the sheet repo for more details.
## Bias, Risks, and Limitations
The models are not yet ready to replace subjective listening tests in scientific papers. They can, however, be used to compare heterogeneous systems, e.g., during development.
## How to Get Started with the Model
Please refer to the README in the sheet repo for more details.
## Evaluation
### Testing Data, Factors & Metrics
#### Testing Data
Please refer to the `egs` folder in the sheet repo for more details.
#### Metrics
Commonly used metrics for SSQA are MSE (mean squared error), LCC (linear correlation coefficient), SRCC (Spearman rank correlation coefficient), and KTAU (Kendall's tau). A code snippet for calculating them can be found here: https://gist.github.com/unilight/883726c94640cca1f4d4068e29c3d20f
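For reference, the four metrics can be sketched with NumPy alone. This is a minimal illustration, not the toolkit's own implementation (the gist above is authoritative): the rank correlations here do not handle tied scores, which `scipy.stats.spearmanr` and `scipy.stats.kendalltau` do.

```python
import numpy as np

def compute_sqa_metrics(true_scores, pred_scores):
    """Compute MSE, LCC, SRCC, and KTAU between true and predicted scores.

    Simplified sketch: assumes no tied scores in either array.
    """
    t = np.asarray(true_scores, dtype=float)
    p = np.asarray(pred_scores, dtype=float)

    # MSE: mean squared error between predictions and ground truth.
    mse = float(np.mean((t - p) ** 2))

    # LCC: Pearson (linear) correlation coefficient.
    lcc = float(np.corrcoef(t, p)[0, 1])

    # SRCC: Pearson correlation of the ranks (Spearman, ignoring ties).
    def ranks(x):
        return np.argsort(np.argsort(x)).astype(float)

    srcc = float(np.corrcoef(ranks(t), ranks(p))[0, 1])

    # KTAU: Kendall's tau-a, (concordant - discordant) / total pairs.
    n = len(t)
    total = n * (n - 1) // 2
    concordant = sum(
        np.sign(t[i] - t[j]) == np.sign(p[i] - p[j])
        for i in range(n)
        for j in range(i + 1, n)
    )
    ktau = float((2 * concordant - total) / total)

    return {"MSE": mse, "LCC": lcc, "SRCC": srcc, "KTAU": ktau}

# Example: predictions offset by a constant are perfectly rank-correlated.
metrics = compute_sqa_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.5, 3.5, 4.5])
```

In this example the constant offset yields MSE of 0.25 while all three correlations are 1.0, illustrating why MOS-Bench reports correlations in addition to error: a systematically biased predictor can still rank systems correctly.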
Please refer to the MOS-Bench (arXiv; 2024) paper for details.
### Results
Please refer to the MOS-Bench (arXiv; 2024) paper for details.
## Citation
BibTeX:
```bibtex
@inproceedings{sheet,
  title     = {{SHEET: A Multi-purpose Open-source Speech Human Evaluation Estimation Toolkit}},
  author    = {Wen-Chin Huang and Erica Cooper and Tomoki Toda},
  year      = {2025},
  booktitle = {{Proc. Interspeech}},
  pages     = {2355--2359},
}

@article{huang2024,
  title         = {MOS-Bench: Benchmarking Generalization Abilities of Subjective Speech Quality Assessment Models},
  author        = {Wen-Chin Huang and Erica Cooper and Tomoki Toda},
  year          = {2024},
  eprint        = {2411.03715},
  archivePrefix = {arXiv},
  primaryClass  = {cs.SD},
  url           = {https://arxiv.org/abs/2411.03715},
}
```
## Model Card Contact
- Wen-Chin Huang, Nagoya University
- Email: [email protected]
- GitHub: unilight