How can I align timesteps to text for Parakeet-tdt-0.6b-v2 output using KenLM?
#48, opened by Nguyen667201
Thanks to the NeMo team for the SOTA model! When I ran inference with the Parakeet ASR model and KenLM, I got output like this:
Hypothesis(score=-4.529468536376953, y_sequence=tensor([1024, 223, 224, 5, 709, 50, 9, 172, 309, 64, 5, 168,
167, 840, 822, 239, 839, 5, 147, 840, 821, 59, 819, 862,
882, 15, 131, 55, 229, 131, 55, 39, 148, 4, 826, 30,
104, 326, 841]), text='because by the final square on the chessboard, the debt is 18 billion trillion grains of rice.', dec_out=None, dec_state=(tensor([[[ 4.2064e-02, 1.4051e-02, -5.5140e-01, ..., -2.0446e-01,
-1.0781e-04, -5.2153e-05]],
[[-1.3940e-02, 4.0917e-02, 5.9655e-02, ..., 8.7109e-02,
-4.7848e-03, 5.0823e-02]]]), tensor([[[ 0.9999, 0.0141, -0.6204, ..., -0.3344, -0.0409, -0.0473]],
[[-0.0511, 0.1249, 0.2977, ..., 1.0271, -2.0819, 0.1519]]])), timestep=[8, 15, 19, 23, 27, 59, 62, 65, 67, 69, 71, 74, 75, 77, 79, 81, 87, 88, 90, 93, 97, 99, 101, 104, 117, 118, 119, 125, 126, 128, 133, 135, 136, 138, 140, 144, 148, 156], alignments=None, frame_confidence=None, token_confidence=None, word_confidence=None, length=tensor(164, device='cuda:0'), y=None, lm_state=None, lm_scores=None, ngram_lm_state=None, tokens=None, last_token=None, token_duration=None, last_frame=164)
But I don't know how to align the text output with the timesteps. How can I get the text aligned to the values in `timestep`?
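For reference, here is a minimal sketch of the kind of alignment I am after, not a confirmed NeMo recipe. It rests on several assumptions: that each entry in `timestep` is the encoder frame index at which the matching token was emitted; that the extra leading id in `y_sequence` (1024 above, 39 ids against 38 timesteps) is a start/blank symbol that can be dropped; that one encoder frame covers about 0.08 s (10 ms hop with 8x subsampling); and that the tokenizer marks word starts with the SentencePiece "▁" prefix. The names `align_tokens`, `group_words`, and `id_to_piece` are hypothetical helpers, not part of any library.

```python
from typing import Callable, List, Sequence, Tuple

# Assumption: 10 ms feature hop x 8x encoder subsampling = 0.08 s per frame.
FRAME_SECONDS = 0.08


def align_tokens(
    token_ids: Sequence[int],
    timesteps: Sequence[int],
    id_to_piece: Callable[[int], str],
    frame_seconds: float = FRAME_SECONDS,
) -> List[Tuple[str, float]]:
    """Pair each token piece with its start time in seconds."""
    ids = list(token_ids)
    # Assumption: one extra leading id is a start/blank symbol with no timestep.
    if len(ids) == len(timesteps) + 1:
        ids = ids[1:]
    if len(ids) != len(timesteps):
        raise ValueError("token ids and timesteps do not line up")
    return [(id_to_piece(i), t * frame_seconds) for i, t in zip(ids, timesteps)]


def group_words(
    aligned: Sequence[Tuple[str, float]], word_marker: str = "\u2581"
) -> List[Tuple[str, float]]:
    """Merge token pieces into words, keeping the first piece's start time."""
    words: List[List] = []
    for piece, start in aligned:
        if piece.startswith(word_marker) or not words:
            # A piece with the marker (or the very first piece) opens a new word.
            words.append([piece.lstrip(word_marker), start])
        else:
            # Continuation pieces and punctuation attach to the current word.
            words[-1][0] += piece
    return [(text, start) for text, start in words]


if __name__ == "__main__":
    # Toy vocabulary for illustration only; not the real Parakeet vocabulary.
    toy_vocab = {1: "\u2581be", 2: "cause", 3: "\u2581by"}
    pairs = align_tokens([1024, 1, 2, 3], [8, 15, 19], toy_vocab.__getitem__)
    for word, start in group_words(pairs):
        print(f"{start:6.2f}s  {word}")
```

With a real model, `token_ids` would be `hyp.y_sequence.tolist()` and `timesteps` would be `hyp.timestep`. For `id_to_piece`, something like `lambda i: asr_model.tokenizer.ids_to_tokens([i])[0]` might work, but I have not verified that with this checkpoint. The model card also shows `asr_model.transcribe([...], timestamps=True)`, which may return word-level timestamps directly; I do not know whether that path works together with KenLM decoding.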