# py-xiaozhi

Release License: MIT Stars Download Gitee

English | [简体中文](README.md) ## Project Introduction py-xiaozhi is a Python-based Xiaozhi voice client, designed to learn coding and experience AI voice interaction without hardware requirements. This repository is ported from [xiaozhi-esp32](https://github.com/78/xiaozhi-esp32). ## Demo - [Bilibili Demo Video](https://www.bilibili.com/video/BV1HmPjeSED2/#reply255921347937) ![Image](https://github.com/user-attachments/assets/df8bd5d2-a8e6-4203-8084-46789fc8e9ad) ## Features - **AI Voice Interaction**: Supports voice input and recognition, enabling smart human-computer interaction with natural conversation flow. - **Visual Multimodal**: Supports image recognition and processing, providing multimodal interaction capabilities and image content understanding. - **IoT Device Integration**: Supports smart home device control, enabling more IoT functions and building a smart home ecosystem. - **Online Music Playback**: Advanced Music Player: A high-performance music player built on Pygame, supporting play/pause/stop, progress control, lyric display, and local caching, delivering a more stable and smooth listening experience. - **Voice Wake-up**: Supports wake word activation, eliminating manual operation (disabled by default, manual activation required). - **Auto Dialogue Mode**: Implements continuous dialogue experience, enhancing user interaction fluidity. - **Graphical Interface**: Provides intuitive GUI with Xiaozhi expressions and text display, enhancing visual experience. - **Command Line Mode**: Supports CLI operation, suitable for embedded devices or environments without GUI. - **Cross-platform Support**: Compatible with Windows 10+, macOS 10.15+, and Linux systems for use anywhere. - **Volume Control**: Supports volume adjustment to adapt to different environmental requirements with unified sound control interface. - **Session Management**: Effectively manages multi-turn dialogues to maintain interaction continuity. - **Encrypted Audio Transmission**: Supports WSS protocol to ensure audio data security and prevent information leakage. - **Automatic Verification Code Handling**: Automatically copies verification codes and opens browsers during first use, simplifying user operations. - **Automatic MAC Address Acquisition**: Avoids MAC address conflicts and improves connection stability. - **Modular Code**: Code is split and encapsulated into classes with clear responsibilities, facilitating secondary development. - **Stability Optimization**: Fixes multiple issues including reconnection and cross-platform compatibility. ## System Requirements - Python version: 3.9 >= version <= 3.12 - Supported operating systems: Windows 10+, macOS 10.15+, Linux - Microphone and speaker devices ## Read This First! - Carefully read [项目文档](https://huangjunsen0406.github.io/py-xiaozhi/) for startup tutorials and file descriptions - The main branch has the latest code; manually reinstall pip dependencies after each update to ensure you have new dependencies [Zero to Xiaozhi Client (Video Tutorial)](https://www.bilibili.com/video/BV1dWQhYEEmq/?vd_source=2065ec11f7577e7107a55bbdc3d12fce) ## State Transition Diagram ``` +----------------+ | | v | +------+ Wake/Button +------------+ | +------------+ | IDLE | -----------> | CONNECTING | --+-> | LISTENING | +------+ +------------+ +------------+ ^ | | | Voice Recognition Complete | +------------+ v +--------- | SPEAKING | <-----------------+ Playback +------------+ Complete ``` ## Upcoming Features - [ ] **New GUI (Electron)**: Provides a more modern and beautiful user interface, optimizing the interaction experience. ## FAQ - **Can't find audio device**: Please check if your microphone and speakers are properly connected and enabled. - **Wake word not responding**: Check if the `USE_WAKE_WORD` setting in `config.json` is set to `true` and the model path is correct. - **Network connection failure**: Check network settings and firewall configuration to ensure WebSocket or MQTT communication is not blocked. - **Packaging failure**: Make sure PyInstaller is installed (`pip install pyinstaller`) and all dependencies are installed. Then re-execute `python scripts/build.py` ## Related Third-party Open Source Projects [Xiaozhi Mobile Client](https://github.com/TOM88812/xiaozhi-android-client) [xiaozhi-esp32-server (Open source server)](https://github.com/xinnan-tech/xiaozhi-esp32-server) [XiaoZhiAI_server32_Unity(Unity Development)](https://gitee.com/vw112266/XiaoZhiAI_server32_Unity) [IntelliConnect(Aiot Middleware)](https://github.com/ruanrongman/IntelliConnect) [open-xiaoai(Xiaoai Audio Access Xiaozhi)](https://github.com/idootop/open-xiaoai.git) ## Related Branches - main: Main branch - feature/v1: First version - feature/visual: Visual branch - feature/raspberry_pi embedded device branch ## Project Structure ``` ├── .github # GitHub related configurations ├── assets # Resource files (emotion animations, etc.) ├── cache # Cache directory (music and temporary files) ├── config # Configuration directory ├── documents # Documentation directory ├── hooks # PyInstaller hooks directory ├── libs # Dependencies directory ├── scripts # Utility scripts directory ├── src # Source code directory │ ├── audio_codecs # Audio encoding/decoding module │ ├── audio_processing # Audio processing module │ ├── constants # Constants definition │ ├── display # Display interface module │ ├── iot # IoT device related module │ │ └── things # Specific device implementation directory │ ├── network # Network communication module │ ├── protocols # Communication protocol module │ └── utils # Utility classes module ``` ## Contribution Guidelines We welcome issue reports and code contributions. Please ensure you follow these specifications: 1. Code style complies with PEP8 standards 2. PR submissions include appropriate tests 3. Update relevant documentation ## Community and Support ### Thanks to the Following Open Source Contributors > In no particular order [Xiaoxia](https://github.com/78) [zhh827](https://github.com/zhh827) [SmartArduino-Li Honggang](https://github.com/SmartArduino) [HonestQiao](https://github.com/HonestQiao) [vonweller](https://github.com/vonweller) [Sun Weigong](https://space.bilibili.com/416954647) [isamu2025](https://github.com/isamu2025) [Rain120](https://github.com/Rain120) [kejily](https://github.com/kejily) [Radio bilibili Jun](https://space.bilibili.com/119751) ### Sponsorship Support

Thanks to All Sponsors ❤️

Whether it's API resources, device compatibility testing, or financial support, every contribution makes the project more complete

View Sponsors Become a Sponsor
## Project Statistics [![Star History Chart](https://api.star-history.com/svg?repos=huangjunsen0406/py-xiaozhi&type=Date)](https://www.star-history.com/#huangjunsen0406/py-xiaozhi&Date) ## License [MIT License](LICENSE)