Is it limited to producing a single track?
#21, opened by BigDeeper
It is not clear from the card whether it is possible to produce separate tracks for different speakers, with appropriate silences to allow the "others" to speak.
Currently this isn’t supported. All speakers are rendered into a single audio track, rather than separate tracks.
Yes, all voices are currently mixed into a single track. If you’d like to separate them, we recommend using post-processing techniques such as VAD and diarization to manually split the generated audio.
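For anyone who wants to try the post-processing route, here is a minimal sketch of the splitting step only. It assumes you already have mono samples and a list of `(speaker, start_sec, end_sec)` segments from whatever diarization tool you use; the function name `split_by_speaker` and the segment format are made up for illustration and are not part of this model's API. Each output track keeps the full length of the original, with silence wherever that speaker is not talking.

```python
def split_by_speaker(samples, sample_rate, segments):
    """Return {speaker: track}; every track has the same length as `samples`.

    `segments` is an iterable of (speaker, start_sec, end_sec) tuples
    (hypothetical format, assumed to come from a diarization tool).
    """
    n = len(samples)
    tracks = {}
    for speaker, start_sec, end_sec in segments:
        # Each speaker starts with an all-silence track of full length.
        track = tracks.setdefault(speaker, [0.0] * n)
        # Convert seconds to sample indices and clamp to the valid range.
        start = max(0, int(round(start_sec * sample_rate)))
        end = min(n, int(round(end_sec * sample_rate)))
        if start < end:
            # Copy the mixed audio for this segment into the speaker's track.
            track[start:end] = samples[start:end]
    return tracks


if __name__ == "__main__":
    mixed = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
    diarized = [("A", 0.0, 1.0), ("B", 1.0, 2.0)]
    for name, track in split_by_speaker(mixed, 4, diarized).items():
        print(name, track)
```

Note that this only copies time ranges out of the mixed audio. If two speakers overlap, the overlapping samples land in both tracks, since the mix itself cannot be undone this way.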
I noted that Layer 0/1 dimension 609 seems to be the relevant activation.