Commit 5c8e22e (verified) by CasanovaE · 1 parent: 8594dbf

Update README.md

  1. README.md +7 -0
README.md CHANGED
@@ -40,6 +40,13 @@ Model | Sample Rate | Frame Rate | Bit Rate | # Codebooks | Codebook Size | Em
  [0.6kbps-12.5fps](https://huggingface.co/nvidia/nemo-nano-codec-22khz-0.6kbps-12.5fps) | 22050 | 12.5 | 0.6kbps | 4 | 4032 | 16 | [9, 8, 8, 7] |
  [1.89kbps-21.5fps](https://huggingface.co/nvidia/nemo-nano-codec-22khz-1.89kbps-21.5fps) | 22050 | 21.5 | 1.89kbps | 8 | 2016 | 32 | [8, 7, 6, 6] |
 
+ ⚠️ **Note on 0.6kbps-12.5fps**
+ This variant is designed for **fine-tuning with a limited set of speakers**, as shown in our [S2S Duplex paper](https://www.isca-archive.org/interspeech_2025/hu25f_interspeech.html).
+ It is **not recommended** for general-purpose audio encoding or decoding.
+
+ ℹ️ **Recommended Variants**
+ Both **1.78kbps-12.5fps** and **1.89kbps-21.5fps** achieve similar audio reconstruction quality.
+ However, our [Magpie TTS](https://build.nvidia.com/nvidia/magpie-tts-multilingual) model performs best with **1.89kbps-21.5fps**.
 
  ## License/Terms of Use
  [NVIDIA Open Model License Agreement](https://developer.download.nvidia.com/licenses/nvidia-open-model-license-agreement-june-2024.pdf)
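
The bit rates in the two table rows shown in this diff can be sanity-checked from the other columns. The sketch below assumes the usual relation for a multi-codebook codec, bit rate = frame rate × number of codebooks × log2(codebook size), and it also assumes that the bracketed list in the last column holds per-dimension quantization levels whose product is the codebook size (the column header is truncated in the hunk, so that reading is a guess). The names `VARIANTS` and `bitrate_kbps` are illustrative only and are not part of NeMo.

```python
import math

# Values copied from the two table rows visible in the diff.
# Reading "levels" as per-dimension quantizer levels is an assumption:
# the corresponding column header is truncated in the hunk.
VARIANTS = {
    "0.6kbps-12.5fps": {
        "fps": 12.5, "codebooks": 4, "codebook_size": 4032, "levels": [9, 8, 8, 7],
    },
    "1.89kbps-21.5fps": {
        "fps": 21.5, "codebooks": 8, "codebook_size": 2016, "levels": [8, 7, 6, 6],
    },
}


def bitrate_kbps(fps: float, codebooks: int, codebook_size: int) -> float:
    """Return frames/s * codebooks/frame * bits/code, expressed in kbps."""
    return fps * codebooks * math.log2(codebook_size) / 1000.0


for name, v in VARIANTS.items():
    kbps = bitrate_kbps(v["fps"], v["codebooks"], v["codebook_size"])
    # The product of the bracketed levels should equal the codebook size
    # if the assumption about that column is right.
    product = math.prod(v["levels"])
    print(f"{name}: {kbps:.2f} kbps, levels product = {product}")
```

Under these assumptions the computed rates round to 0.60 kbps and 1.89 kbps, and the level products come out to 4032 and 2016, which agrees with the table. The 1.78kbps-12.5fps variant mentioned in the note is not in the visible rows, so it is left out here.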