Update README.md
README.md
CHANGED
@@ -20,7 +20,7 @@ The large model pre-trained on 16kHz sampled speech audio. When using the model
The Finnish Wav2Vec2 Large has the same architecture and uses the same training objective as the English and multilingual one described in [Paper](https://arxiv.org/abs/2006.11477). It is pre-trained on 158k hours of unlabeled Finnish speech, including [KAVI radio and television archive materials](https://kavi.fi/en/radio-ja-televisioarkistointia-vuodesta-2008/), Lahjoita puhetta (Donate Speech), Finnish Parliament, Finnish VoxPopuli.

-You can read more about the pre-trained model from [this paper](
+You can read more about the pre-trained model from [this paper](https://www.isca-archive.org/interspeech_2025/getman25_interspeech.html). The training scripts are available on [GitHub](https://github.com/aalto-speech/large-scale-monolingual-speech-foundation-models).
## Intended uses & limitations
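
For illustration, here is a minimal sketch of pulling frame-level features from the pre-trained model with the Hugging Face `transformers` library. The repository id is a placeholder (see the author's Hugging Face profile below for the actual one), and the snippet assumes the checkpoint is published in `transformers` format rather than the original Fairseq one.

```python
# Minimal feature-extraction sketch; the model id below is a hypothetical placeholder.
import numpy as np
import torch
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

model_id = "GetmanY1/<finnish-wav2vec2-large>"  # placeholder repository id

feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained(model_id)
model = Wav2Vec2Model.from_pretrained(model_id)

# One second of 16 kHz mono audio (silence, for illustration only).
speech = np.zeros(16000, dtype=np.float32)
inputs = feature_extractor(speech, sampling_rate=16000, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# Frame-level representations with shape (batch, frames, hidden_size).
print(outputs.last_hidden_state.shape)
```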
@@ -105,6 +105,22 @@ The pre-trained model was initialized with the following hyperparameters:
- Pytorch 1.13.1+rocm5.2
- Fairseq 0.12.2

+## Citation
+
+If you use our models or scripts, please cite our article as:
+
+```bibtex
+@inproceedings{getman25_interspeech,
+  title = {{Is your model big enough? Training and interpreting large-scale monolingual speech foundation models}},
+  author = {{Yaroslav Getman and Tamás Grósz and Tommi Lehtonen and Mikko Kurimo}},
+  year = {{2025}},
+  booktitle = {{Interspeech 2025}},
+  pages = {{231--235}},
+  doi = {{10.21437/Interspeech.2025-46}},
+  issn = {{2958-1796}},
+}
+```
+
## Team Members
- Yaroslav Getman, [Hugging Face profile](https://huggingface.co/GetmanY1), [LinkedIn profile](https://www.linkedin.com/in/yaroslav-getman/)