GetmanY1 committed eacc6a2 (verified)
Parent(s): 49ec877

Update README.md

Files changed (1):
1. README.md (+17 -1)
README.md CHANGED
@@ -20,7 +20,7 @@ The x-large model pre-trained on 16kHz sampled speech audio. When using the mode
 
 The Finnish Wav2Vec2 X-Large has the same architecture and uses the same training objective as the multilingual one described in [this paper](https://www.isca-archive.org/interspeech_2022/babu22_interspeech.pdf). It is pre-trained on 158k hours of unlabeled Finnish speech, including [KAVI radio and television archive materials](https://kavi.fi/en/radio-ja-televisioarkistointia-vuodesta-2008/), Lahjoita puhetta (Donate Speech), the Finnish Parliament, and Finnish VoxPopuli.
 
- You can read more about the pre-trained model in [this paper](TODO). The training scripts are available on [GitHub](https://github.com/aalto-speech/large-scale-monolingual-speech-foundation-models).
+ You can read more about the pre-trained model in [this paper](https://www.isca-archive.org/interspeech_2025/getman25_interspeech.html). The training scripts are available on [GitHub](https://github.com/aalto-speech/large-scale-monolingual-speech-foundation-models).
 
 ## Intended uses & limitations
 
@@ -105,6 +105,22 @@ The pre-trained model was initialized with the following hyperparameters:
 - Pytorch 1.13.1+rocm5.2
 - Fairseq 0.12.2
 
+ ## Citation
+
+ If you use our models or scripts, please cite our article as:
+
+ ```bibtex
+ @inproceedings{getman25_interspeech,
+   title = {{Is your model big enough? Training and interpreting large-scale monolingual speech foundation models}},
+   author = {Yaroslav Getman and Tamás Grósz and Tommi Lehtonen and Mikko Kurimo},
+   year = {2025},
+   booktitle = {Interspeech 2025},
+   pages = {231--235},
+   doi = {10.21437/Interspeech.2025-46},
+   issn = {2958-1796},
+ }
+ ```
+
 ## Team Members
 
 - Yaroslav Getman, [Hugging Face profile](https://huggingface.co/GetmanY1), [LinkedIn profile](https://www.linkedin.com/in/yaroslav-getman/)
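
The updated card describes a standard wav2vec2-style encoder operating on 16 kHz audio, so a checkpoint in Hugging Face format should be loadable through the usual `transformers` wav2vec2 classes. Below is a minimal sketch; the repository id is a placeholder assumption, not something stated in this commit, so substitute the actual repo this README belongs to:

```python
# Minimal sketch of extracting speech representations with the standard
# transformers wav2vec2 API. The repository id is a placeholder
# (assumption) -- replace it with the actual model repo.
import numpy as np
import torch
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

model_id = "GetmanY1/<this-model-repo>"  # hypothetical placeholder id

feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained(model_id)
model = Wav2Vec2Model.from_pretrained(model_id)
model.eval()

# The card specifies 16 kHz sampled speech; one second of silence
# stands in for real audio here.
waveform = np.zeros(16000, dtype=np.float32)
inputs = feature_extractor(waveform, sampling_rate=16000, return_tensors="pt")

with torch.no_grad():
    hidden_states = model(**inputs).last_hidden_state

print(hidden_states.shape)  # (batch, frames, hidden_size)
```

Since this is a self-supervised pre-trained encoder rather than a finished recognizer, `Wav2Vec2ForCTC` would be the usual entry point for fine-tuning it on a labeled Finnish ASR task.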