|
# py-xiaozhi |
|
<p align="center"> |
|
<a href="https://github.com/huangjunsen0406/py-xiaozhi/releases/latest"> |
|
<img src="https://img.shields.io/github/v/release/huangjunsen0406/py-xiaozhi?style=flat-square&logo=github&color=blue" alt="Release"/> |
|
</a> |
|
<a href="https://opensource.org/licenses/MIT"> |
|
<img src="https://img.shields.io/badge/License-MIT-green.svg?style=flat-square" alt="License: MIT"/> |
|
</a> |
|
<a href="https://github.com/huangjunsen0406/py-xiaozhi/stargazers"> |
|
<img src="https://img.shields.io/github/stars/huangjunsen0406/py-xiaozhi?style=flat-square&logo=github" alt="Stars"/> |
|
</a> |
|
<a href="https://github.com/huangjunsen0406/py-xiaozhi/releases/latest"> |
|
<img src="https://img.shields.io/github/downloads/huangjunsen0406/py-xiaozhi/total?style=flat-square&logo=github&color=52c41a1&maxAge=86400" alt="Download"/> |
|
</a> |
|
<a href="https://gitee.com/huang-jun-sen/py-xiaozhi"> |
|
<img src="https://img.shields.io/badge/Gitee-FF5722?style=flat-square&logo=gitee" alt="Gitee"/> |
|
</a> |
|
</p> |
|
|
|
English | [็ฎไฝไธญๆ](README.md) |
|
|
|
## Project Introduction |
|
py-xiaozhi is a Python-based Xiaozhi voice client, designed to learn coding and experience AI voice interaction without hardware requirements. This repository is ported from [xiaozhi-esp32](https://github.com/78/xiaozhi-esp32). |
|
|
|
## Demo |
|
- [Bilibili Demo Video](https://www.bilibili.com/video/BV1HmPjeSED2/#reply255921347937) |
|
|
|
 |
|
|
|
## Features |
|
- **AI Voice Interaction**: Supports voice input and recognition, enabling smart human-computer interaction with natural conversation flow. |
|
- **Visual Multimodal**: Supports image recognition and processing, providing multimodal interaction capabilities and image content understanding. |
|
- **IoT Device Integration**: Supports smart home device control, enabling more IoT functions and building a smart home ecosystem. |
|
- **Online Music Playback**: Advanced Music Player: A high-performance music player built on Pygame, supporting play/pause/stop, progress control, lyric display, and local caching, delivering a more stable and smooth listening experience. |
|
- **Voice Wake-up**: Supports wake word activation, eliminating manual operation (disabled by default, manual activation required). |
|
- **Auto Dialogue Mode**: Implements continuous dialogue experience, enhancing user interaction fluidity. |
|
- **Graphical Interface**: Provides intuitive GUI with Xiaozhi expressions and text display, enhancing visual experience. |
|
- **Command Line Mode**: Supports CLI operation, suitable for embedded devices or environments without GUI. |
|
- **Cross-platform Support**: Compatible with Windows 10+, macOS 10.15+, and Linux systems for use anywhere. |
|
- **Volume Control**: Supports volume adjustment to adapt to different environmental requirements with unified sound control interface. |
|
- **Session Management**: Effectively manages multi-turn dialogues to maintain interaction continuity. |
|
- **Encrypted Audio Transmission**: Supports WSS protocol to ensure audio data security and prevent information leakage. |
|
- **Automatic Verification Code Handling**: Automatically copies verification codes and opens browsers during first use, simplifying user operations. |
|
- **Automatic MAC Address Acquisition**: Avoids MAC address conflicts and improves connection stability. |
|
- **Modular Code**: Code is split and encapsulated into classes with clear responsibilities, facilitating secondary development. |
|
- **Stability Optimization**: Fixes multiple issues including reconnection and cross-platform compatibility. |
|
|
|
## System Requirements |
|
- Python version: 3.9 >= version <= 3.12 |
|
- Supported operating systems: Windows 10+, macOS 10.15+, Linux |
|
- Microphone and speaker devices |
|
|
|
## Read This First! |
|
- Carefully read [้กน็ฎๆๆกฃ](https://huangjunsen0406.github.io/py-xiaozhi/) for startup tutorials and file descriptions |
|
- The main branch has the latest code; manually reinstall pip dependencies after each update to ensure you have new dependencies |
|
|
|
[Zero to Xiaozhi Client (Video Tutorial)](https://www.bilibili.com/video/BV1dWQhYEEmq/?vd_source=2065ec11f7577e7107a55bbdc3d12fce) |
|
|
|
## State Transition Diagram |
|
|
|
``` |
|
+----------------+ |
|
| | |
|
v | |
|
+------+ Wake/Button +------------+ | +------------+ |
|
| IDLE | -----------> | CONNECTING | --+-> | LISTENING | |
|
+------+ +------------+ +------------+ |
|
^ | |
|
| | Voice Recognition Complete |
|
| +------------+ v |
|
+--------- | SPEAKING | <-----------------+ |
|
Playback +------------+ |
|
Complete |
|
``` |
|
|
|
## Upcoming Features |
|
- [ ] **New GUI (Electron)**: Provides a more modern and beautiful user interface, optimizing the interaction experience. |
|
|
|
## FAQ |
|
- **Can't find audio device**: Please check if your microphone and speakers are properly connected and enabled. |
|
- **Wake word not responding**: Check if the `USE_WAKE_WORD` setting in `config.json` is set to `true` and the model path is correct. |
|
- **Network connection failure**: Check network settings and firewall configuration to ensure WebSocket or MQTT communication is not blocked. |
|
- **Packaging failure**: Make sure PyInstaller is installed (`pip install pyinstaller`) and all dependencies are installed. Then re-execute `python scripts/build.py` |
|
|
|
## Related Third-party Open Source Projects |
|
[Xiaozhi Mobile Client](https://github.com/TOM88812/xiaozhi-android-client) |
|
|
|
[xiaozhi-esp32-server (Open source server)](https://github.com/xinnan-tech/xiaozhi-esp32-server) |
|
|
|
[XiaoZhiAI_server32_Unity(Unity Development)](https://gitee.com/vw112266/XiaoZhiAI_server32_Unity) |
|
|
|
[IntelliConnect(Aiot Middleware)](https://github.com/ruanrongman/IntelliConnect) |
|
|
|
[open-xiaoai(Xiaoai Audio Access Xiaozhi)](https://github.com/idootop/open-xiaoai.git) |
|
|
|
## Related Branches |
|
- main: Main branch |
|
- feature/v1: First version |
|
- feature/visual: Visual branch |
|
- feature/raspberry_pi embedded device branch |
|
## Project Structure |
|
|
|
``` |
|
โโโ .github # GitHub related configurations |
|
โโโ assets # Resource files (emotion animations, etc.) |
|
โโโ cache # Cache directory (music and temporary files) |
|
โโโ config # Configuration directory |
|
โโโ documents # Documentation directory |
|
โโโ hooks # PyInstaller hooks directory |
|
โโโ libs # Dependencies directory |
|
โโโ scripts # Utility scripts directory |
|
โโโ src # Source code directory |
|
โ โโโ audio_codecs # Audio encoding/decoding module |
|
โ โโโ audio_processing # Audio processing module |
|
โ โโโ constants # Constants definition |
|
โ โโโ display # Display interface module |
|
โ โโโ iot # IoT device related module |
|
โ โ โโโ things # Specific device implementation directory |
|
โ โโโ network # Network communication module |
|
โ โโโ protocols # Communication protocol module |
|
โ โโโ utils # Utility classes module |
|
``` |
|
|
|
## Contribution Guidelines |
|
We welcome issue reports and code contributions. Please ensure you follow these specifications: |
|
|
|
1. Code style complies with PEP8 standards |
|
2. PR submissions include appropriate tests |
|
3. Update relevant documentation |
|
|
|
## Community and Support |
|
|
|
### Thanks to the Following Open Source Contributors |
|
> In no particular order |
|
|
|
[Xiaoxia](https://github.com/78) |
|
[zhh827](https://github.com/zhh827) |
|
[SmartArduino-Li Honggang](https://github.com/SmartArduino) |
|
[HonestQiao](https://github.com/HonestQiao) |
|
[vonweller](https://github.com/vonweller) |
|
[Sun Weigong](https://space.bilibili.com/416954647) |
|
[isamu2025](https://github.com/isamu2025) |
|
[Rain120](https://github.com/Rain120) |
|
[kejily](https://github.com/kejily) |
|
[Radio bilibili Jun](https://space.bilibili.com/119751) |
|
|
|
### Sponsorship Support |
|
|
|
<div align="center"> |
|
<h3>Thanks to All Sponsors โค๏ธ</h3> |
|
<p>Whether it's API resources, device compatibility testing, or financial support, every contribution makes the project more complete</p> |
|
|
|
<a href="https://huangjunsen0406.github.io/py-xiaozhi/sponsors/" target="_blank"> |
|
<img src="https://img.shields.io/badge/View-Sponsors-brightgreen?style=for-the-badge&logo=github" alt="View Sponsors"> |
|
</a> |
|
<a href="https://huangjunsen0406.github.io/py-xiaozhi/sponsors/" target="_blank"> |
|
<img src="https://img.shields.io/badge/Become-Sponsor-orange?style=for-the-badge&logo=heart" alt="Become a Sponsor"> |
|
</a> |
|
</div> |
|
|
|
## Project Statistics |
|
[](https://www.star-history.com/#huangjunsen0406/py-xiaozhi&Date) |
|
|
|
## License |
|
[MIT License](LICENSE) |