nvidia/parakeet-tdt-0.6b-v2 · How does the model handle timestamp decoding?

#50

by StephennFernandes - opened 7 days ago

StephennFernandes

7 days ago

I wanted to know how the model handles timestamp decoding during inference, for both word-level and segment-level timestamps.
Are these timestamps learned as part of ASR training, similar to Whisper, or are the output transcription and audio passed through VAD and MFA to obtain accurate timestamped transcriptions?
