mWhisper-Flamingo for Multilingual Audio-Visual Noise-Robust Speech Recognition • Paper • 2502.01547 • Published Feb 3
Omni-R1: Do You Really Need Audio to Fine-Tune Your Audio LLM? • Paper • 2505.09439 • Published 3 days ago • 6 • 2
Whisper-Flamingo: Integrating Visual Features into Whisper for Audio-Visual Speech Recognition and Translation • Paper • 2406.10082 • Published Jun 14, 2024 • 1
ibm-granite/granite-speech-3.3-8b • Automatic Speech Recognition • Updated about 7 hours ago • 6.27k • 43
voidful/wav2vec2-xlsr-multilingual-56 • Automatic Speech Recognition • Updated Mar 18, 2023 • 15.8k • 30