SmolVLM: Redefining small and efficient multimodal models Paper • 2504.05299 • Published 8 days ago • 158
distil-whisper/distil-large-v3.5-ONNX Automatic Speech Recognition • Updated 21 days ago • 21 • 1
distil-large-v3.5 Collection This collection contains the model repositories for distil-large-v3.5, which provides support for the most popular Whisper libraries. • 5 items • Updated 21 days ago • 7
distil-whisper/distil-large-v3.5-ct2 Automatic Speech Recognition • Updated 27 days ago • 216 • 2
distil-whisper/distil-large-v3.5 Automatic Speech Recognition • Updated 21 days ago • 4.76k • 22
Post 12920
We did it. Kokoro TTS (v1.0) can now run 100% locally in your browser w/ WebGPU acceleration. Real-time text-to-speech without a server. Generate 10 seconds of speech in ~1 second for $0. What will you build? webml-community/kokoro-webgpu
The most difficult part was getting the model running in the first place, but the next steps are simple:
- Implement sentence splitting, allowing for streamed responses
- Multilingual support (only phonemization left)
Who wants to help?
11 replies
High-Fidelity Simultaneous Speech-To-Speech Translation Paper • 2502.03382 • Published Feb 5 • 8