saattrupdan/wav2vec2-xls-r-300m-ftspeech

Automatic Speech Recognition


XLS-R-300m-FTSpeech

Model description

This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m on the FTSpeech dataset, which contains 1,800 hours of transcribed speeches from the Danish parliament.
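
As a rough illustration of how a fine-tuned checkpoint like this one is typically used, the sketch below loads it through the Hugging Face `transformers` automatic-speech-recognition pipeline. The helper name `transcribe` and the audio file name in the usage note are hypothetical and not part of this model card. The sketch assumes `transformers`, a PyTorch backend, and `ffmpeg` (for decoding audio files) are installed. If the repository bundles an n-gram decoder, additional packages such as `pyctcdecode` and `kenlm` may also be required; that is an assumption, not something stated above.

```python
# Model identifier as shown at the top of this model card.
MODEL_ID = "saattrupdan/wav2vec2-xls-r-300m-ftspeech"


def transcribe(audio_path: str) -> str:
    """Transcribe one Danish audio file and return the recognised text.

    `transcribe` is a hypothetical helper written for this example.
    """
    # Imported inside the function so that merely defining the helper
    # does not require the (heavy) transformers dependency.
    from transformers import pipeline

    # The first call downloads the weights (about 315M parameters).
    asr = pipeline("automatic-speech-recognition", model=MODEL_ID)

    # When given a file path, the pipeline decodes the audio and resamples
    # it to the sampling rate the model's feature extractor expects.
    result = asr(audio_path)
    return result["text"]
```

Calling `transcribe("speech.wav")`, where `speech.wav` stands in for any local recording of Danish speech, should return the transcription as a plain string.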

Performance

The model achieves the following word error rate (WER) scores (lower is better):

Dataset	WER without LM	WER with 5-gram LM
Danish part of Common Voice 8.0	20.48	17.91
Alvenir test set	15.46	13.84
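
For readers unfamiliar with the metric, here is a minimal sketch of how WER is conventionally defined: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the model output, divided by the number of reference words. This is not the evaluation script behind the table. The function names and the example sentence are made up for illustration, and real evaluations usually normalise text (casing, punctuation) first. The second helper computes the relative WER reduction from adding the 5-gram LM, using the figures in the table above.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix handled
    # so far and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(wer_without_lm: float, wer_with_lm: float) -> float:
    """Fraction by which the WER drops when the language model is added."""
    return (wer_without_lm - wer_with_lm) / wer_without_lm


# Made-up example: one of five reference words is dropped.
example = word_error_rate("jeg vil gerne have ordet", "jeg vil gerne ordet")
print(f"{100 * example:.2f}")  # prints 20.00

# Figures taken from the table above.
common_voice = relative_reduction(20.48, 17.91)
alvenir = relative_reduction(15.46, 13.84)
```

With the table's numbers, the 5-gram LM reduces the WER by roughly 12.5% relative on Common Voice 8.0 and roughly 10.5% relative on the Alvenir test set.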

License

Use of this model must adhere to the license from the Danish Parliament.


Model size: 315M params (Safetensors, tensor type F32)
