Gemma 3n E4B It
Generate text based on images and videos
Real Time Communication for AI apps in Python
Generate text based on images and videos
Talk or type to ANY LLM!
Talk to OpenAI (Gradio UI)
Gemini understands audio and video!
Talk to OpenAI using their multimodal API
Talk with Qwen Omni over the Phone
Llama 3.2 - SambaNova API (Gradio)
Llama 3.2 - SambaNova API
Talk to Llama 4 using Groq + Cloudflare
Talk to Gemini (Gradio UI)
Talk to Gemini using Google's multimodal API
Transcribe audio in realtime - Gradio UI version
Transcribe audio in realtime with Whisper
Llama 3.2 - SambaNova API
Talk to your dog!
Llama 3.2 - SambaNova API
Turn Credentials Powered by Cloudflare
FastRTC Voice Agent with smolagents
Have two Gemini agents talk to each other
Real-time captions with Moonshine ONNX
Create interactive HTML web pages with your voice
Say computer (Gradio)
Say computer before asking your question
Detect objects in live video feeds