Update README.md
README.md
CHANGED
@@ -191,17 +191,7 @@ This model accepts 16000 KHz Mono-channel Audio (wav files) as input.
 
 This model provides transcribed speech as a string for a given audio sample.
 
-## NVIDIA Riva: Deployment
-
-If you like this and other models from NVIDIA (i.e., CTC-based Conformers) check out [NVIDIA Riva](https://developer.nvidia.com/riva), an accelerated speech AI SDK deployable on-prem, in all clouds, multi-cloud, hybrid, on edge, and embedded. This model, as well as other RNNT-based models are currently not supported by Riva. You can find the list of models supported by Riva [here](https://docs.nvidia.com/deeplearning/riva/user-guide/docs/reference/models/index.html).
 
-Additionally, Riva provides:
-
-* World-class out-of-the-box accuracy for the most common languages with model checkpoints trained on proprietary data with hundreds of thousands of GPU-compute hours
-* Best in class accuracy via customization with run-time word boosting (e.g., brand and product names), acoustic model training, language model training, and inverse text normalization customizations
-* Streaming speech recognition, Kubernetes compatible scaling, and Enterprise-grade support
-
-Check out [Riva live demo](https://developer.nvidia.com/riva#demos).
 
 ## Model Architecture
 
@@ -242,6 +232,18 @@ The list of the available models in this collection is shown in the following ta
 ## Limitations
 Since this model was trained on publicly available speech datasets, the performance of this model might degrade for speech which includes technical terms, or vernacular that the model has not been trained on. The model might also perform worse for accented speech.
 
+## NVIDIA Riva: Deployment
+
+If you like this and other models from NVIDIA (i.e., CTC-based Conformers) check out [NVIDIA Riva](https://developer.nvidia.com/riva), an accelerated speech AI SDK deployable on-prem, in all clouds, multi-cloud, hybrid, on edge, and embedded. This model, as well as other RNNT-based models are currently not supported by Riva. You can find the list of models supported by Riva [here](https://docs.nvidia.com/deeplearning/riva/user-guide/docs/reference/models/index.html).
+
+Additionally, Riva provides:
+
+* World-class out-of-the-box accuracy for the most common languages with model checkpoints trained on proprietary data with hundreds of thousands of GPU-compute hours
+* Best in class accuracy via customization with run-time word boosting (e.g., brand and product names), acoustic model training, language model training, and inverse text normalization customizations
+* Streaming speech recognition, Kubernetes compatible scaling, and Enterprise-grade support
+
+Check out [Riva live demo](https://developer.nvidia.com/riva#demos).
+
 ## References
 [1] [Conformer: Convolution-augmented Transformer for Speech Recognition](https://arxiv.org/abs/2005.08100)
 [2] [Google Sentencepiece Tokenizer](https://github.com/google/sentencepiece)