wav2vec2-large-xls-r-300m-cv8-nl
This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m on the common_voice dataset. In addition, a 6-gram KenLM language model was trained and used; it was built from the Common Voice 8 train and validation sets. Results on the Common Voice 8 test set are shown on the right side of the model card.
Model description
A Dutch wav2vec2-xls-r-300m model fine-tuned on the Common Voice 8 dataset.
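The card does not include usage code, so the following is only a minimal sketch of how a checkpoint like this is commonly loaded with the Transformers `pipeline` API. The helper name `transcribe` and the audio path are hypothetical. Whether the 6-gram KenLM model is applied during decoding depends on the repository contents and on optional packages such as `pyctcdecode` and `kenlm` being installed, which is an assumption here. Reading an audio file by path also assumes `ffmpeg` is available.

```python
# Repository name as shown on this model card.
MODEL_ID = "RuudVelo/wav2vec2-large-xls-r-300m-cv8-nl"


def transcribe(audio_path: str) -> str:
    """Transcribe one Dutch audio file (hypothetical helper, not from the model card)."""
    # Imported lazily so that defining the helper does not require the library.
    from transformers import pipeline

    # Downloads the checkpoint on first use; requires network access.
    asr = pipeline("automatic-speech-recognition", model=MODEL_ID)

    # The pipeline decodes the file and resamples it to the rate the model expects.
    result = asr(audio_path)
    return result["text"]
```

A call would look like `transcribe("sample_nl.wav")`, where `sample_nl.wav` stands in for any Dutch speech recording.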
Intended uses & limitations
More information needed
Training and evaluation data
The model was trained on Dutch Common Voice 8 for 75 epochs. The training set was the Common Voice 8 train split and the evaluation set was the Common Voice 8 validation split. The reported WER is measured on the Common Voice 8 test split, which was used for neither training nor validation (eval).
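The card does not say which tool or text normalization produced the reported scores. As an illustration only, word error rate (WER) and character error rate (CER) are conventionally defined as the edit distance between reference and hypothesis divided by the reference length, expressed as a percentage. The sketch below implements that standard definition in plain Python; the function names and the example sentences are made up for illustration.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete a reference token
                            curr[j - 1] + 1,      # insert a hypothesis token
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate in percent for one non-empty reference."""
    ref_words = reference.split()
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate in percent for one non-empty reference (spaces count)."""
    return 100.0 * edit_distance(list(reference), list(hypothesis)) / len(reference)


# Illustrative sentences, not taken from Common Voice.
reference = "de kat zit op de mat"
hypothesis = "de kat zat op mat"
print(f"WER: {wer(reference, hypothesis):.2f}%")
print(f"CER: {cer(reference, hypothesis):.2f}%")
```

Corpus-level scores are usually obtained by summing edit distances and reference lengths over all test utterances before dividing, rather than by averaging per-utterance rates.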
Training procedure
Training hyperparameters
Framework versions
- Transformers 4.16.0.dev0
- Pytorch 1.10.1+cu102
- Datasets 1.18.1
- Tokenizers 0.11.0
Dataset used to train RuudVelo/wav2vec2-large-xls-r-300m-cv8-nl
Evaluation results
- Test WER on Common Voice 8 (self-reported): 14.530
- Test CER on Common Voice 8 (self-reported): 4.700
- Test WER on Robust Speech Event - Dev Data (self-reported): 33.700
- Test CER on Robust Speech Event - Dev Data (self-reported): 15.640
- Test WER on Robust Speech Event - Test Data (self-reported): 35.190