Commit 57e1a98
Parent(s): 818e79f
Path to pretrained model
README.md CHANGED
@@ -58,7 +58,7 @@ Next you can use the model using the `transformers` Python package as follows:
 
 ## Model Details
 
-Wav2Vec2 is a state-of-the-art model architecture for speech recognition, leveraging self-supervised learning from raw audio data. The pre-trained [wav2vec2-xls-r-1b](facebook/wav2vec2-xls-r-1b) has been fine-tuned for automatic speech recognition on the [CoRal-v2 dataset](https://huggingface.co/datasets/CoRal-project/coral-v2/tree/main) to enhance its performance in recognizing Danish speech across different dialects. The model was trained for 30K steps using the training setup in the [CoRaL repository](https://github.com/alexandrainst/coral/tree) by running:
+Wav2Vec2 is a state-of-the-art model architecture for speech recognition, leveraging self-supervised learning from raw audio data. The pre-trained [wav2vec2-xls-r-1b](https://huggingface.co/facebook/wav2vec2-xls-r-1b) has been fine-tuned for automatic speech recognition on the [CoRal-v2 dataset](https://huggingface.co/datasets/CoRal-project/coral-v2/tree/main) to enhance its performance in recognizing Danish speech across different dialects. The model was trained for 30K steps using the training setup in the [CoRaL repository](https://github.com/alexandrainst/coral/tree) by running:
 
 ```
 python src/scripts/finetune_asr_model.py model=wav2vec2-medium max_steps=30000 datasets.coral_conversation_internal.id=CoRal-project/coral-v2 datasets.coral_readaloud_internal.id=CoRal-project/coral-v2
 ```

@@ -208,7 +208,7 @@ We would like specifically to thank Dan Saattrup Nielsen, Alexandra Institute fo
 
 ## Citation
 ```bibtex
 @misc{roest-wav2vec2-1B-v2,
-  author = {Marie Juhl Jørgensen, Søren Vejlgaard Holm, Martin Carsten Nielsen, Dan Saattrup Nielsen, Sif Bernstorff Lehmann
+  author = {Marie Juhl Jørgensen, Søren Vejlgaard Holm, Martin Carsten Nielsen, Dan Saattrup Nielsen, Sif Bernstorff Lehmann, Simon Leminen Madsen and Torben Blach},
   title = {Roest-wav2vec-1B-v2: A Danish state-of-the-art speech recognition model trained on varied demographics and dialects},
   year = {2025},
   url = {https://huggingface.co/CoRal-project/roest-wav2vec2-1B-v2},
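The hunk context above mentions using the model through the `transformers` Python package. A minimal usage sketch, assuming the standard `pipeline` API for the `automatic-speech-recognition` task and a local 16 kHz audio file (`sample.wav` is a placeholder path, not from the diff):

```python
# Sketch: transcribe Danish speech with the fine-tuned checkpoint.
# Requires: pip install transformers torch
# "sample.wav" is a hypothetical placeholder for your own audio file.
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="CoRal-project/roest-wav2vec2-1B-v2",
)

# The pipeline resamples the audio as needed and returns a dict
# with the decoded text under the "text" key.
transcription = asr("sample.wav")["text"]
print(transcription)
```

Note that this downloads the ~1B-parameter checkpoint on first use; pass `device=0` to `pipeline(...)` to run on a GPU if one is available.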