Any publication?

by sappho192 - opened 2 days ago

Hi, thank you for releasing this model to the public.

I'd like to study what changes were made in this 2.0 version compared to the previous model, but I couldn't find any papers related to this.
Is there any way I can find out in detail what has changed?

Thanks in advance.

naymaraq

NVIDIA org about 11 hours ago

The biggest difference is in the training dataset, plus slightly different augmentations. The training data for version 2.0 includes non-speech audio samples (such as coughing, laughter, and breathing) to help the model distinguish between speech and non-speech sounds.

You can refer to the MarbleNet paper: https://arxiv.org/pdf/2010.13886
