Smart Turn v3

Smart Turn v3 is an open‑source semantic Voice Activity Detection (VAD) model that tells you whether a speaker has finished their turn by analysing the raw waveform, not the transcript.

Model architecture

Backbone: Whisper Tiny encoder
Head: shallow linear classifier
Parameters: 8 M (int8-quantized)
Checkpoint: 8 MB ONNX
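
The parameter count and checkpoint size listed above are consistent with each other: at int8 precision each weight takes one byte, so roughly 8 M parameters come to roughly 8 MB. The sketch below is a back-of-the-envelope check, not a measurement of the actual file. It assumes the size is dominated by the weights and ignores ONNX graph metadata and quantization scales. The helper name `estimate_checkpoint_mb` is hypothetical, and the fp32 figure is only a comparison point, not a claim that such a checkpoint is published.

```python
def estimate_checkpoint_mb(num_params: int, bytes_per_param: int = 1) -> float:
    """Rough weight-only size in megabytes (1 MB = 10**6 bytes)."""
    return num_params * bytes_per_param / 1_000_000


# "8 M" parameters, taken from the architecture list above.
params = 8_000_000

# int8 stores one byte per weight; fp32 would need four bytes per weight.
int8_mb = estimate_checkpoint_mb(params, bytes_per_param=1)
fp32_mb = estimate_checkpoint_mb(params, bytes_per_param=4)

print(f"int8: ~{int8_mb:.0f} MB, fp32: ~{fp32_mb:.0f} MB")
# prints "int8: ~8 MB, fp32: ~32 MB"
```

Under these assumptions, the same weights stored unquantized would be about four times larger.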

How to use

Please see the blog post and GitHub repo for more information on using the model, either standalone or with Pipecat.
