---
title: Voxtral
emoji: ⚡
colorFrom: gray
colorTo: green
sdk: gradio
sdk_version: 5.38.0
app_file: app.py
pinned: false
license: apache-2.0
short_description: Chat and transcribe audio files with AI, powered by Voxtral.
---

# Voxtral Pro Interface

![Python](https://img.shields.io/badge/Python-3.9+-blue?logo=python&logoColor=white) ![Gradio](https://img.shields.io/badge/Gradio-5.38-orange?logo=gradio) ![License: Apache 2.0](https://img.shields.io/badge/License-Apache_2.0-blue.svg) ![Hugging Face Spaces](https://img.shields.io/badge/🤗%20Hugging%20Face-Spaces-yellow)

An advanced, feature-rich Gradio UI to explore the full power of Mistral AI's multimodal model, `voxtral`.

Voxtral Pro Demo

## 🚀 About The Project

Voxtral Pro was created to explore and showcase the full range of capabilities of Mistral AI's multimodal model, `voxtral`. The application goes beyond a simple chat interface and provides a toolkit for working with audio and text: high-quality transcription, multi-turn multimodal conversation, and agent-like tool use.

The project also serves as a practical example of how to build robust, user-friendly, production-ready applications on top of state-of-the-art foundation models.

## ✨ Key Features

* **🎙️ High-Quality Transcription:** Transcribe large audio files with high accuracy using the Mistral API.
* **📄 SRT Subtitle Generation:** Automatically generate and export `.srt` subtitle files with segment-level timestamps, useful for content creators.
* **💬 Multimodal Chat:** Hold multi-turn conversations that combine text and audio inputs in the same exchange.
* **🤖 Tool Use / Function Calling:** Demonstrates the model's ability to call external functions to retrieve information (e.g., getting city data), showcasing its agent-like capabilities.
* **🔐 Secure API Key Handling:** Your Mistral API key is kept in your browser's session storage and is not saved anywhere else.
* **🎨 Modern UI:** A clean, responsive interface built with Gradio.

## 🛠️ Tech Stack

This project is built with a modern, asynchronous Python stack:

* **Backend:** [Python](https://www.python.org/)
* **Web Framework:** [Gradio](https://www.gradio.app/)
* **API Client:** [httpx](https://www.python-httpx.org/) with `asyncio` for non-blocking API calls.
* **Deployment:** [Hugging Face Spaces](https://huggingface.co/spaces)

## 🏁 Getting Started

Follow these instructions to get a local copy up and running.

### Prerequisites

* Python 3.9+
* Git

### Installation & Configuration

1. **Clone the repository:**
   ```sh
   git clone https://huggingface.co/spaces/hasanbasbunar/Voxtral && cd Voxtral
   ```
2. **Create and activate a virtual environment:**
   ```sh
   python3 -m venv .venv
   source .venv/bin/activate
   ```
3. **Install dependencies:**
   ```sh
   pip install -r requirements.txt
   ```
4. **Configure your API Key:** Create a file named `.env` in the root of the project and add your Mistral API key:
   ```
   MISTRAL_API_KEY="your_api_key_here"
   ```
   *The application also lets you enter the key directly in the UI if you prefer not to use an `.env` file.*

### Running the Application

1. **Launch the app:**
   ```sh
   python app.py
   ```
2. Open your browser and navigate to `http://127.0.0.1:7860`.

## 🚢 Deployment

This app is designed to be easy to deploy. It is currently live on [Hugging Face Spaces](https://huggingface.co/spaces/hasanbasbunar/Voxtral).

To deploy your own version, you can use any platform that supports Python applications. For a production environment, set `debug=False` in `app.py`.

Example for platforms that use a `PORT` environment variable:

```python
# in app.py
import os  # needed for os.environ below

demo.launch(server_port=int(os.environ.get("PORT", 7860)), debug=False)