---
title: ML6-Gemini-Demo
app_file: src/app.py
sdk: gradio
sdk_version: 5.23.0
---

# Gemini Voice Agent Demo

This repo contains a demo that uses the Gemini MultiModal API to create a voice-based agent that can conduct professional technical screening interviews.

## Technical Overview

The system uses FastRTC and Gradio to provide a real-time voice UI.

### About the modality

You can configure the output modality:

- If set to AUDIO
  - The agent will respond with an audio response.
  - There is no text output, so no transcription is available.
- If set to TEXT
  - The agent will respond with a text response.
  - The text output will be converted to audio using the TTS API.
  - Transcriptions are available.

### Function Calling

There are 2 functions that can be called:

- Answer validation
  - checks the answer type against the expected type
  - stores the answer
- Log Input
  - logs the user input
  - this is a form of transcribing the incoming audio

## Getting Started

To run the application, follow these steps:

1. Install uv (if not already installed): `curl -LsSf https://astral.sh/uv/install.sh | sh`
2. Install dependencies: `uv sync`
3. Set up the environment variables for either GenAI or VertexAI (see below)
4. Run the application: `python src/app.py`
5. Visit `http://127.0.0.1:7860` in your browser to interact with the voice agent.

### GenAI vs VertexAI

"gemini-2.0-flash-exp" can be used in both GenAI and VertexAI. [more info](https://github.com/heiko-hotz/gemini-multimodal-live-dev-guide?tab=readme-ov-file)

- GenAI requires just a GEMINI_API_KEY environment variable [link](https://ai.google.dev/gemini-api/docs/api-key)
- VertexAI requires a GCP project and the following environment variables:

```
export GOOGLE_CLOUD_PROJECT=YOUR_PROJECT_ID
export GOOGLE_CLOUD_LOCATION=europe-west4
export GOOGLE_GENAI_USE_VERTEXAI=True
```

Depending on the `GOOGLE_GENAI_USE_VERTEXAI` flag, this demo will use either GenAI or VertexAI.

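As a rough illustration of the GenAI/VertexAI switch, the sketch below picks a backend from the environment variables listed in this section. It is not taken from `src/app.py`: the `resolve_backend` helper, the shape of the returned dict, and the choice to treat `True`/`1` as truthy values are all assumptions made for the example.

```python
import os
from typing import Mapping


def resolve_backend(env: Mapping[str, str] = os.environ) -> dict:
    """Hypothetical helper: choose GenAI or VertexAI from environment variables."""
    # Assumption: "true" or "1" (case-insensitive) enables VertexAI.
    flag = env.get("GOOGLE_GENAI_USE_VERTEXAI", "").strip().lower()
    if flag in {"true", "1"}:
        required = ("GOOGLE_CLOUD_PROJECT", "GOOGLE_CLOUD_LOCATION")
        missing = [name for name in required if not env.get(name)]
        if missing:
            raise RuntimeError(f"VertexAI selected but missing: {', '.join(missing)}")
        return {
            "backend": "vertexai",
            "project": env["GOOGLE_CLOUD_PROJECT"],
            "location": env["GOOGLE_CLOUD_LOCATION"],
        }

    # Otherwise fall back to GenAI, which only needs an API key.
    api_key = env.get("GEMINI_API_KEY")
    if not api_key:
        raise RuntimeError("GenAI selected but GEMINI_API_KEY is not set")
    return {"backend": "genai", "api_key": api_key}
```

Calling `resolve_backend()` with no arguments reads the real process environment; passing a plain dict makes the logic easy to check without exporting anything.
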
### Note

The gradio-webrtc install fails unless you have ffmpeg@6. On macOS:

```
brew uninstall ffmpeg
brew install ffmpeg@6
brew link ffmpeg@6
```
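
To confirm which ffmpeg major version is on your `PATH` before installing, a small check along these lines may help. It is only a sketch: the helper names are hypothetical, and it assumes the first line of `ffmpeg -version` has the usual `ffmpeg version X.Y...` form, which some custom builds do not follow.

```python
import re
import shutil
import subprocess
from typing import Optional


def parse_ffmpeg_major(version_output: str) -> Optional[int]:
    """Extract the major version from `ffmpeg -version` output, or None if not found."""
    # Assumption: release builds print e.g. "ffmpeg version 6.1.1 ..." or "ffmpeg version n6.1 ...".
    match = re.search(r"ffmpeg version n?(\d+)[.\s-]", version_output)
    return int(match.group(1)) if match else None


def installed_ffmpeg_major() -> Optional[int]:
    """Return the major version of the ffmpeg on PATH, or None if it is missing or unparseable."""
    exe = shutil.which("ffmpeg")
    if exe is None:
        return None
    result = subprocess.run([exe, "-version"], capture_output=True, text=True, check=False)
    return parse_ffmpeg_major(result.stdout)
```

If `installed_ffmpeg_major()` returns something other than `6`, the brew commands above are the place to start.
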