xiaozhi / README.en.md
nzjsdsk's picture
Upload 169 files
27e74f3 verified
|
raw
history blame
9.01 kB
# py-xiaozhi
<p align="center">
<a href="https://github.com/huangjunsen0406/py-xiaozhi/releases/latest">
<img src="https://img.shields.io/github/v/release/huangjunsen0406/py-xiaozhi?style=flat-square&logo=github&color=blue" alt="Release"/>
</a>
<a href="https://opensource.org/licenses/MIT">
<img src="https://img.shields.io/badge/License-MIT-green.svg?style=flat-square" alt="License: MIT"/>
</a>
<a href="https://github.com/huangjunsen0406/py-xiaozhi/stargazers">
<img src="https://img.shields.io/github/stars/huangjunsen0406/py-xiaozhi?style=flat-square&logo=github" alt="Stars"/>
</a>
<a href="https://github.com/huangjunsen0406/py-xiaozhi/releases/latest">
<img src="https://img.shields.io/github/downloads/huangjunsen0406/py-xiaozhi/total?style=flat-square&logo=github&color=52c41a1&maxAge=86400" alt="Download"/>
</a>
<a href="https://gitee.com/huang-jun-sen/py-xiaozhi">
<img src="https://img.shields.io/badge/Gitee-FF5722?style=flat-square&logo=gitee" alt="Gitee"/>
</a>
</p>
English | [็ฎ€ไฝ“ไธญๆ–‡](README.md)
## Project Introduction
py-xiaozhi is a Python-based Xiaozhi voice client, designed to learn coding and experience AI voice interaction without hardware requirements. This repository is ported from [xiaozhi-esp32](https://github.com/78/xiaozhi-esp32).
## Demo
- [Bilibili Demo Video](https://www.bilibili.com/video/BV1HmPjeSED2/#reply255921347937)
![Image](https://github.com/user-attachments/assets/df8bd5d2-a8e6-4203-8084-46789fc8e9ad)
## Features
- **AI Voice Interaction**: Supports voice input and recognition, enabling smart human-computer interaction with natural conversation flow.
- **Visual Multimodal**: Supports image recognition and processing, providing multimodal interaction capabilities and image content understanding.
- **IoT Device Integration**: Supports smart home device control, enabling more IoT functions and building a smart home ecosystem.
- **Online Music Playback**: Advanced Music Player: A high-performance music player built on Pygame, supporting play/pause/stop, progress control, lyric display, and local caching, delivering a more stable and smooth listening experience.
- **Voice Wake-up**: Supports wake word activation, eliminating manual operation (disabled by default, manual activation required).
- **Auto Dialogue Mode**: Implements continuous dialogue experience, enhancing user interaction fluidity.
- **Graphical Interface**: Provides intuitive GUI with Xiaozhi expressions and text display, enhancing visual experience.
- **Command Line Mode**: Supports CLI operation, suitable for embedded devices or environments without GUI.
- **Cross-platform Support**: Compatible with Windows 10+, macOS 10.15+, and Linux systems for use anywhere.
- **Volume Control**: Supports volume adjustment to adapt to different environmental requirements with unified sound control interface.
- **Session Management**: Effectively manages multi-turn dialogues to maintain interaction continuity.
- **Encrypted Audio Transmission**: Supports WSS protocol to ensure audio data security and prevent information leakage.
- **Automatic Verification Code Handling**: Automatically copies verification codes and opens browsers during first use, simplifying user operations.
- **Automatic MAC Address Acquisition**: Avoids MAC address conflicts and improves connection stability.
- **Modular Code**: Code is split and encapsulated into classes with clear responsibilities, facilitating secondary development.
- **Stability Optimization**: Fixes multiple issues including reconnection and cross-platform compatibility.
## System Requirements
- Python version: 3.9 >= version <= 3.12
- Supported operating systems: Windows 10+, macOS 10.15+, Linux
- Microphone and speaker devices
## Read This First!
- Carefully read [้กน็›ฎๆ–‡ๆกฃ](https://huangjunsen0406.github.io/py-xiaozhi/) for startup tutorials and file descriptions
- The main branch has the latest code; manually reinstall pip dependencies after each update to ensure you have new dependencies
[Zero to Xiaozhi Client (Video Tutorial)](https://www.bilibili.com/video/BV1dWQhYEEmq/?vd_source=2065ec11f7577e7107a55bbdc3d12fce)
## State Transition Diagram
```
+----------------+
| |
v |
+------+ Wake/Button +------------+ | +------------+
| IDLE | -----------> | CONNECTING | --+-> | LISTENING |
+------+ +------------+ +------------+
^ |
| | Voice Recognition Complete
| +------------+ v
+--------- | SPEAKING | <-----------------+
Playback +------------+
Complete
```
## Upcoming Features
- [ ] **New GUI (Electron)**: Provides a more modern and beautiful user interface, optimizing the interaction experience.
## FAQ
- **Can't find audio device**: Please check if your microphone and speakers are properly connected and enabled.
- **Wake word not responding**: Check if the `USE_WAKE_WORD` setting in `config.json` is set to `true` and the model path is correct.
- **Network connection failure**: Check network settings and firewall configuration to ensure WebSocket or MQTT communication is not blocked.
- **Packaging failure**: Make sure PyInstaller is installed (`pip install pyinstaller`) and all dependencies are installed. Then re-execute `python scripts/build.py`
## Related Third-party Open Source Projects
[Xiaozhi Mobile Client](https://github.com/TOM88812/xiaozhi-android-client)
[xiaozhi-esp32-server (Open source server)](https://github.com/xinnan-tech/xiaozhi-esp32-server)
[XiaoZhiAI_server32_Unity(Unity Development)](https://gitee.com/vw112266/XiaoZhiAI_server32_Unity)
[IntelliConnect(Aiot Middleware)](https://github.com/ruanrongman/IntelliConnect)
[open-xiaoai(Xiaoai Audio Access Xiaozhi)](https://github.com/idootop/open-xiaoai.git)
## Related Branches
- main: Main branch
- feature/v1: First version
- feature/visual: Visual branch
- feature/raspberry_pi embedded device branch
## Project Structure
```
โ”œโ”€โ”€ .github # GitHub related configurations
โ”œโ”€โ”€ assets # Resource files (emotion animations, etc.)
โ”œโ”€โ”€ cache # Cache directory (music and temporary files)
โ”œโ”€โ”€ config # Configuration directory
โ”œโ”€โ”€ documents # Documentation directory
โ”œโ”€โ”€ hooks # PyInstaller hooks directory
โ”œโ”€โ”€ libs # Dependencies directory
โ”œโ”€โ”€ scripts # Utility scripts directory
โ”œโ”€โ”€ src # Source code directory
โ”‚ โ”œโ”€โ”€ audio_codecs # Audio encoding/decoding module
โ”‚ โ”œโ”€โ”€ audio_processing # Audio processing module
โ”‚ โ”œโ”€โ”€ constants # Constants definition
โ”‚ โ”œโ”€โ”€ display # Display interface module
โ”‚ โ”œโ”€โ”€ iot # IoT device related module
โ”‚ โ”‚ โ””โ”€โ”€ things # Specific device implementation directory
โ”‚ โ”œโ”€โ”€ network # Network communication module
โ”‚ โ”œโ”€โ”€ protocols # Communication protocol module
โ”‚ โ””โ”€โ”€ utils # Utility classes module
```
## Contribution Guidelines
We welcome issue reports and code contributions. Please ensure you follow these specifications:
1. Code style complies with PEP8 standards
2. PR submissions include appropriate tests
3. Update relevant documentation
## Community and Support
### Thanks to the Following Open Source Contributors
> In no particular order
[Xiaoxia](https://github.com/78)
[zhh827](https://github.com/zhh827)
[SmartArduino-Li Honggang](https://github.com/SmartArduino)
[HonestQiao](https://github.com/HonestQiao)
[vonweller](https://github.com/vonweller)
[Sun Weigong](https://space.bilibili.com/416954647)
[isamu2025](https://github.com/isamu2025)
[Rain120](https://github.com/Rain120)
[kejily](https://github.com/kejily)
[Radio bilibili Jun](https://space.bilibili.com/119751)
### Sponsorship Support
<div align="center">
<h3>Thanks to All Sponsors โค๏ธ</h3>
<p>Whether it's API resources, device compatibility testing, or financial support, every contribution makes the project more complete</p>
<a href="https://huangjunsen0406.github.io/py-xiaozhi/sponsors/" target="_blank">
<img src="https://img.shields.io/badge/View-Sponsors-brightgreen?style=for-the-badge&logo=github" alt="View Sponsors">
</a>
<a href="https://huangjunsen0406.github.io/py-xiaozhi/sponsors/" target="_blank">
<img src="https://img.shields.io/badge/Become-Sponsor-orange?style=for-the-badge&logo=heart" alt="Become a Sponsor">
</a>
</div>
## Project Statistics
[![Star History Chart](https://api.star-history.com/svg?repos=huangjunsen0406/py-xiaozhi&type=Date)](https://www.star-history.com/#huangjunsen0406/py-xiaozhi&Date)
## License
[MIT License](LICENSE)