Update README.md #17
opened by steveheh

README.md CHANGED
@@ -276,7 +276,7 @@ NVIDIA [NeMo Canary](https://nvidia.github.io/NeMo/blogs/2024/2024-02-canary/) i
 
 Canary is an encoder-decoder model with FastConformer [1] encoder and Transformer Decoder [2].
 With audio features extracted from the encoder, task tokens such as `<source language>`, `<target language>`, `<task>` and `<toggle PnC>`
-are fed into the Transformer Decoder to trigger the text generation process. Canary uses a concatenated tokenizer from individual
+are fed into the Transformer Decoder to trigger the text generation process. Canary uses a concatenated tokenizer [5] from individual
 SentencePiece [3] tokenizers of each language, which makes it easy to scale up to more languages.
 The Canay-1B model has 24 encoder layers and 24 layers of decoder layers in total.
 
@@ -479,7 +479,7 @@ BLEU score on [FLEURS](https://huggingface.co/datasets/google/fleurs) test set:
 
 | **Version** | **Model** | **En->De** | **En->Es** | **En->Fr** | **De->En** | **Es->En** | **Fr->En** |
 |:-----------:|:---------:|:----------:|:----------:|:----------:|:----------:|:----------:|:----------:|
-| 1.23.0 | canary-1b |
+| 1.23.0 | canary-1b | 32.13 | 22.66 | 40.76 | 33.98 | 21.80 | 30.95 |
 
 
 BLEU score on [COVOST-v2](https://github.com/facebookresearch/covost) test set:
@@ -518,6 +518,7 @@ Check out [Riva live demo](https://developer.nvidia.com/riva#demos).
 
 [4] [NVIDIA NeMo Toolkit](https://github.com/NVIDIA/NeMo)
 
+[5] [Unified Model for Code-Switching Speech Recognition and Language Identification Based on Concatenated Tokenizer](https://aclanthology.org/2023.calcs-1.7.pdf)
 
 ## Licence
 
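
For reviewers unfamiliar with the concatenated tokenizer that the new [5] citation points to, the general idea can be sketched roughly as follows. This is a toy illustration only, not NeMo's implementation: `ToyTokenizer` is a hypothetical whitespace-based stand-in for a per-language SentencePiece model, and `ConcatenatedTokenizer` and its method names are made up for this sketch. The assumption is that each language keeps its own vocabulary and the combined vocabulary is formed by shifting each language's token IDs by a fixed offset, so the ID ranges do not overlap.

```python
from dataclasses import dataclass


@dataclass
class ToyTokenizer:
    """Hypothetical stand-in for one per-language SentencePiece model."""
    vocab: list[str]

    def encode(self, text: str) -> list[int]:
        # Whitespace split only; a real subword model would do far more.
        return [self.vocab.index(tok) for tok in text.split()]


class ConcatenatedTokenizer:
    """Joins per-language tokenizers into one ID space via offsets."""

    def __init__(self, tokenizers: dict[str, ToyTokenizer]):
        self.tokenizers = tokenizers
        self.offsets: dict[str, int] = {}
        offset = 0
        # Each language gets a contiguous, non-overlapping block of IDs.
        for lang, tok in tokenizers.items():
            self.offsets[lang] = offset
            offset += len(tok.vocab)
        self.vocab_size = offset

    def encode(self, text: str, lang: str) -> list[int]:
        off = self.offsets[lang]
        return [off + i for i in self.tokenizers[lang].encode(text)]

    def lang_of(self, token_id: int) -> str:
        # The ID range alone tells us which language a token came from.
        for lang, off in self.offsets.items():
            if off <= token_id < off + len(self.tokenizers[lang].vocab):
                return lang
        raise ValueError(f"token id {token_id} is out of range")

    def decode(self, ids: list[int]) -> str:
        words = []
        for i in ids:
            lang = self.lang_of(i)
            words.append(self.tokenizers[lang].vocab[i - self.offsets[lang]])
        return " ".join(words)


en = ToyTokenizer(["hello", "world"])
de = ToyTokenizer(["hallo", "welt", "gut"])
tok = ConcatenatedTokenizer({"en": en, "de": de})
mixed = tok.encode("hello", "en") + tok.encode("welt", "de")
print(mixed, tok.decode(mixed))
```

Under this scheme, adding a language appends a new block of IDs without retraining or renumbering the existing tokenizers, which is consistent with the README's remark that the design makes it easy to scale up to more languages.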