Language detection
#7, opened by hemant75
How can I detect the language dynamically from a feed? A feed can be in one of two languages, Hindi or English. Please guide.
You can use this code:
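import numpy as np
from faster_whisper import WhisperModel
from scipy.io import wavfile

def limit_languages(audio, allowed_languages=("en", "hi")):
    # Assumes a 16 kHz, mono, 16-bit PCM WAV file; resample or downmix first otherwise.
    sampling_rate, audio_data = wavfile.read(audio)
    audio_data = audio_data.astype(np.float32) / 32768.0  # scale int16 samples to [-1, 1]
    model = WhisperModel("large-v2", device="cpu", compute_type="int8")
    # all_language_probs is a list of (language_code, probability) pairs
    language, language_probability, all_language_probs = model.detect_language(audio_data)
    score = 0
    detected_language = None  # stays None if no allowed language gets a score
    for language_code, language_prob in all_language_probs:
        # Keep the highest-scoring language among the allowed ones only
        if language_code in allowed_languages and language_prob > score:
            score = language_prob
            detected_language = language_code
    return detected_language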
https://github.com/SYSTRAN/faster-whisper/issues/1164#issuecomment-2495601955
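The selection step in the code above can also be pulled out into a small function that does not need the model, which makes it easy to try on hand-written scores. This is only a sketch: `pick_allowed_language` is a hypothetical helper name, and it assumes the probabilities arrive as `(language_code, probability)` pairs, as in the loop above. It also renormalizes the score over the allowed languages, which is an addition not present in the original reply.

```python
from typing import Iterable, Optional, Sequence, Tuple


def pick_allowed_language(
    all_language_probs: Iterable[Tuple[str, float]],
    allowed_languages: Sequence[str] = ("en", "hi"),
) -> Optional[Tuple[str, float]]:
    """Return (code, renormalized probability) for the best allowed language.

    Hypothetical helper, not part of faster-whisper.
    """
    # Keep only the candidates we are willing to accept.
    allowed = {
        code: prob
        for code, prob in all_language_probs
        if code in allowed_languages
    }
    total = sum(allowed.values())
    if total <= 0:
        # None of the allowed languages received any probability mass.
        return None
    best = max(allowed, key=allowed.get)
    # Renormalize so the score is relative to the allowed set only.
    return best, allowed[best] / total


# Made-up scores where the top guess is outside the allowed list.
probs = [("ur", 0.5), ("hi", 0.3), ("en", 0.1), ("pa", 0.1)]
print(pick_allowed_language(probs))
```

With these made-up scores the function ignores the top-ranked language because it is not allowed and returns Hindi, with a confidence measured against English only. A low renormalized score can be used as a hint that the clip may be in neither language.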