Commit 941be2f

Merge branch 'main' into remote_ctrl
2 parents 6071962 + 6896246 commit 941be2f

249 files changed (+18898, -1361 lines)

.gitignore

Lines changed: 5 additions & 2 deletions
@@ -7,6 +7,9 @@ sdkconfig.old
 sdkconfig
 dependencies.lock
 .env
-.DS_Store
 releases/
-.cache/
+main/assets/lang_config.h
+main/mmap_generate_emoji.h
+.DS_Store
+.cache
+main/mmap_generate_emoji.h
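
The added lines in this hunk list `main/mmap_generate_emoji.h` twice. Git tolerates repeated patterns, so this is harmless, but a small check can flag such repeats when reviewing a merge. The sketch below is a hypothetical helper (`find_duplicate_patterns` is not part of the repository) that reports patterns appearing more than once; the sample text mirrors the post-merge entries visible in the hunk above.

```python
from collections import Counter


def find_duplicate_patterns(gitignore_text: str) -> list[str]:
    """Return ignore patterns that occur more than once, in first-seen order."""
    patterns = [
        line.strip()
        for line in gitignore_text.splitlines()
        # Skip blank lines and comment lines, which Git ignores as well.
        if line.strip() and not line.lstrip().startswith("#")
    ]
    counts = Counter(patterns)
    # dict.fromkeys removes repeats while keeping first-seen order.
    return [p for p in dict.fromkeys(patterns) if counts[p] > 1]


# Entries as they appear on the new side of the hunk above.
sample = """\
sdkconfig
dependencies.lock
.env
releases/
main/assets/lang_config.h
main/mmap_generate_emoji.h
.DS_Store
.cache
main/mmap_generate_emoji.h
"""
duplicates = find_duplicate_patterns(sample)
print(duplicates)
```

Run against the real `.gitignore`, the same function could be fed the file's full text, for example via `pathlib.Path(".gitignore").read_text()`.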

CMakeLists.txt

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@
 # CMakeLists in this exact order for cmake to work correctly
 cmake_minimum_required(VERSION 3.16)
 
-set(PROJECT_VER "1.6.2")
+set(PROJECT_VER "1.7.0")
 
 # Add this line to disable the specific warning
 add_compile_options(-Wno-missing-field-initializers)

README.md

Lines changed: 13 additions & 15 deletions
@@ -7,9 +7,7 @@
 
 👉 [Video introduction【bilibili】](https://www.bilibili.com/video/BV1icXPYVEMN/)
 
-👉 [Build your AI chat companion with ESP32+SenseVoice+Qwen72B!【bilibili】](https://www.bilibili.com/video/BV11msTenEH3/)
-
-👉 [Equipping XiaoZhi with DeepSeek's smart brain【bilibili】](https://www.bilibili.com/video/BV1GQP6eNEFG/)
+👉 [Human: Give AI a camera vs AI: Instantly finds out the owner hasn't washed hair for three days【bilibili】](https://www.bilibili.com/video/BV1bpjgzKEhd/)
 
 👉 [AI-01 Module User Manual](docs/AI-01_使用手册.pdf)
 
@@ -32,21 +30,20 @@
 Implemented features
 
 - Wi-Fi / ML307 Cat.1 4G
-- BOOT button wake-up and interruption, supporting both click and long-press triggers
 - Offline voice wake-up [ESP-SR](https://github.com/espressif/esp-sr)
-- Streaming voice dialogue (WebSocket or UDP protocol)
-- Speech recognition in 5 languages: Mandarin, Cantonese, English, Japanese, Korean [SenseVoice](https://github.com/FunAudioLLM/SenseVoice)
-- Voiceprint recognition, identifies who is calling the AI's name [3D Speaker](https://github.com/modelscope/3D-Speaker)
-- Large model TTS (Volcano Engine or CosyVoice)
-- Large language models (Qwen, DeepSeek, Doubao)
-- Configurable prompts and voice tones (custom characters)
-- Short-term memory, self-summarizing after each conversation round
-- OLED / LCD display, showing signal strength or conversation content
-- Supports image emoji on LCD
-- Multi-language support (Chinese, English)
+- Supports two communication protocols ([Websocket](docs/websocket.md) or MQTT+UDP)
+- Uses OPUS audio codec
+- Voice interaction based on streaming ASR + LLM + TTS architecture
+- Voiceprint recognition, identifies the current speaker [3D Speaker](https://github.com/modelscope/3D-Speaker)
+- OLED / LCD display, supports emoji display
+- Battery level display and power management
+- Multi-language support (Chinese, English, Japanese)
+- Supports ESP32-C3, ESP32-S3, ESP32-P4 chip platforms
+- Device control via device-side MCP (volume, lights, motors, GPIO, etc.)
+- Extends large model capabilities via cloud-side MCP (smart home control, PC desktop operation, knowledge search, sending and receiving email, etc.)
 
 ## Software
-* ESP-IDF 5.3 or later is required; 5.3 is the recommended version. See the [official guide](https://docs.espressif.com/projects/esp-idf/zh_CN/latest/esp32c2/get-started/index.html)
+* ESP-IDF 5.4 or later is required; 5.4 is the recommended version. See the [official guide](https://docs.espressif.com/projects/esp-idf/zh_CN/latest/esp32c2/get-started/index.html)
 * Build
 ```
 idf.py @main/boards/doit-ai-01-kit/boards.cfg build
@@ -110,6 +107,7 @@
 | 你好小智 | |
 | 小艾小艾 (upgrade required) | Hey Alice (upgrade required) |
 
+👉 [Beginner's Firmware Flashing Guide](https://ccnphfhqs21z.feishu.cn/wiki/Zpz4wXBtdimBrLk25WdcXzxcnNS)
 
 ### Robot

README_en.md

Lines changed: 66 additions & 60 deletions
@@ -1,68 +1,68 @@
-# XiaoZhi AI Chatbot
+# An MCP-based Chatbot
 
-([中文](README.md) | English | [日本語](README_ja.md))
+(English | [中文](README.md) | [日本語](README_ja.md))
 
-## Introduction
+## Video
+
+👉 [Human: Give AI a camera vs AI: Instantly finds out the owner hasn't washed hair for three days【bilibili】](https://www.bilibili.com/video/BV1bpjgzKEhd/)
 
-👉 [Build your AI chat companion with ESP32+SenseVoice+Qwen72B!【bilibili】](https://www.bilibili.com/video/BV11msTenEH3/)
+👉 [Handcraft your AI girlfriend, beginner's guide【bilibili】](https://www.bilibili.com/video/BV1XnmFYLEJN/)
 
-👉 [Equipping XiaoZhi with DeepSeek's smart brain【bilibili】](https://www.bilibili.com/video/BV1GQP6eNEFG/)
+## Introduction
 
-👉 [Build your own AI companion, a beginner's guide【bilibili】](https://www.bilibili.com/video/BV1XnmFYLEJN/)
+This is an open-source ESP32 project, released under the MIT license, allowing anyone to use it for free, including for commercial purposes.
 
-## Project Purpose
+We hope this project helps everyone understand AI hardware development and apply rapidly evolving large language models to real hardware devices.
 
-This is an open-source project released under the MIT license, allowing anyone to use it freely, including for commercial purposes.
+If you have any ideas or suggestions, please feel free to raise Issues or join the QQ group: 575180511
 
-Through this project, we aim to help more people get started with AI hardware development and understand how to implement rapidly evolving large language models in actual hardware devices. Whether you're a student interested in AI or a developer exploring new technologies, this project offers valuable learning experiences.
+### Control Everything with MCP
 
-Everyone is welcome to participate in the project's development and improvement. If you have any ideas or suggestions, please feel free to raise an Issue or join the chat group.
+As a voice interaction entry, the XiaoZhi AI chatbot leverages the AI capabilities of large models like Qwen / DeepSeek, and achieves multi-terminal control via the MCP protocol.
 
-Learning & Discussion QQ Group: 376893254
+![Control everything via MCP](docs/mcp-based-graph.jpg)
 
-## Implemented Features
+### Features Implemented
 
 - Wi-Fi / ML307 Cat.1 4G
-- BOOT button wake-up and interruption, supporting both click and long-press triggers
 - Offline voice wake-up [ESP-SR](https://github.com/espressif/esp-sr)
-- Streaming voice dialogue (WebSocket or UDP protocol)
-- Support for 5 languages: Mandarin, Cantonese, English, Japanese, Korean [SenseVoice](https://github.com/FunAudioLLM/SenseVoice)
-- Voice print recognition to identify who's calling AI's name [3D Speaker](https://github.com/modelscope/3D-Speaker)
-- Large model TTS (Volcano Engine or CosyVoice)
-- Large Language Models (Qwen, DeepSeek, Doubao)
-- Configurable prompts and voice tones (custom characters)
-- Short-term memory, self-summarizing after each conversation round
-- OLED / LCD display showing signal strength or conversation content
-- Support for LCD image expressions
-- Multi-language support (Chinese, English)
-
-## Hardware Section
+- Supports two communication protocols ([Websocket](docs/websocket.md) or MQTT+UDP)
+- Uses OPUS audio codec
+- Voice interaction based on streaming ASR + LLM + TTS architecture
+- Speaker recognition, identifies the current speaker [3D Speaker](https://github.com/modelscope/3D-Speaker)
+- OLED / LCD display, supports emoji display
+- Battery display and power management
+- Multi-language support (Chinese, English, Japanese)
+- Supports ESP32-C3, ESP32-S3, ESP32-P4 chip platforms
+- Device-side MCP for device control (Speaker, LED, Servo, GPIO, etc.)
+- Cloud-side MCP to extend large model capabilities (smart home control, PC desktop operation, knowledge search, email, etc.)
+
+## Hardware
 
 ### Breadboard DIY Practice
 
 See the Feishu document tutorial:
 
-👉 [XiaoZhi AI Chatbot Encyclopedia](https://ccnphfhqs21z.feishu.cn/wiki/F5krwD16viZoF0kKkvDcrZNYnhb?from=from_copylink)
+👉 ["XiaoZhi AI Chatbot Encyclopedia"](https://ccnphfhqs21z.feishu.cn/wiki/F5krwD16viZoF0kKkvDcrZNYnhb?from=from_copylink)
 
-Breadboard demonstration:
+Breadboard demo:
 
-![Breadboard Demo](docs/wiring2.jpg)
+![Breadboard Demo](docs/v1/wiring2.jpg)
 
-### Supported Open Source Hardware
+### Supports 70+ Open Source Hardware (Partial List)
 
 - <a href="https://oshwhub.com/li-chuang-kai-fa-ban/li-chuang-shi-zhan-pai-esp32-s3-kai-fa-ban" target="_blank" title="LiChuang ESP32-S3 Development Board">LiChuang ESP32-S3 Development Board</a>
 - <a href="https://github.com/espressif/esp-box" target="_blank" title="Espressif ESP32-S3-BOX3">Espressif ESP32-S3-BOX3</a>
 - <a href="https://docs.m5stack.com/zh_CN/core/CoreS3" target="_blank" title="M5Stack CoreS3">M5Stack CoreS3</a>
-- <a href="https://docs.m5stack.com/en/atom/Atomic%20Echo%20Base" target="_blank" title="AtomS3R + Echo Base">AtomS3R + Echo Base</a>
-- <a href="https://docs.m5stack.com/en/core/ATOM%20Matrix" target="_blank" title="AtomMatrix + Echo Base">AtomMatrix + Echo Base</a>
+- <a href="https://docs.m5stack.com/en/atom/Atomic%20Echo%20Base" target="_blank" title="AtomS3R + Echo Base">M5Stack AtomS3R + Echo Base</a>
 - <a href="https://gf.bilibili.com/item/detail/1108782064" target="_blank" title="Magic Button 2.4">Magic Button 2.4</a>
 - <a href="https://www.waveshare.net/shop/ESP32-S3-Touch-AMOLED-1.8.htm" target="_blank" title="Waveshare ESP32-S3-Touch-AMOLED-1.8">Waveshare ESP32-S3-Touch-AMOLED-1.8</a>
 - <a href="https://github.com/Xinyuan-LilyGO/T-Circle-S3" target="_blank" title="LILYGO T-Circle-S3">LILYGO T-Circle-S3</a>
 - <a href="https://oshwhub.com/tenclass01/xmini_c3" target="_blank" title="XiaGe Mini C3">XiaGe Mini C3</a>
-- <a href="https://oshwhub.com/movecall/moji-xiaozhi-ai-derivative-editi" target="_blank" title="Movecall Moji ESP32S3">Moji XiaoZhi AI Derivative Version</a>
-- <a href="https://oshwhub.com/movecall/cuican-ai-pendant-lights-up-y" target="_blank" title="Movecall CuiCan ESP32S3">CuiCan AI pendant</a>
+- <a href="https://oshwhub.com/movecall/cuican-ai-pendant-lights-up-y" target="_blank" title="Movecall CuiCan ESP32S3">CuiCan AI Pendant</a>
 - <a href="https://github.com/WMnologo/xingzhi-ai" target="_blank" title="WMnologo-Xingzhi-1.54">WMnologo-Xingzhi-1.54TFT</a>
 - <a href="https://www.seeedstudio.com/SenseCAP-Watcher-W1-A-p-5979.html" target="_blank" title="SenseCAP Watcher">SenseCAP Watcher</a>
+- <a href="https://www.bilibili.com/video/BV1BHJtz6E2S/" target="_blank" title="ESP-HI Low Cost Robot Dog">ESP-HI Low Cost Robot Dog</a>
 
 <div style="display: flex; justify-content: space-between;">
 <a href="docs/v1/lichuang-s3.jpg" target="_blank" title="LiChuang ESP32-S3 Development Board">
@@ -77,23 +77,17 @@ Breadboard demonstration:
 <a href="docs/v1/atoms3r.jpg" target="_blank" title="AtomS3R + Echo Base">
 <img src="docs/v1/atoms3r.jpg" width="240" />
 </a>
-<a href="docs/AtomMatrix-echo-base.jpg" target="_blank" title="AtomMatrix-echo-base + Echo Base">
-<img src="docs/AtomMatrix-echo-base.jpg" width="240" />
-</a>
-<a href="docs/v1/magiclick.jpg" target="_blank" title="MagiClick 2.4">
+<a href="docs/v1/magiclick.jpg" target="_blank" title="Magic Button 2.4">
 <img src="docs/v1/magiclick.jpg" width="240" />
 </a>
 <a href="docs/v1/waveshare.jpg" target="_blank" title="Waveshare ESP32-S3-Touch-AMOLED-1.8">
 <img src="docs/v1/waveshare.jpg" width="240" />
 </a>
-<a href="docs/lilygo-t-circle-s3.jpg" target="_blank" title="LILYGO T-Circle-S3">
-<img src="docs/lilygo-t-circle-s3.jpg" width="240" />
-</a>
-<a href="docs/xmini-c3.jpg" target="_blank" title="Xmini C3">
-<img src="docs/xmini-c3.jpg" width="240" />
+<a href="docs/v1/lilygo-t-circle-s3.jpg" target="_blank" title="LILYGO T-Circle-S3">
+<img src="docs/v1/lilygo-t-circle-s3.jpg" width="240" />
 </a>
-<a href="docs/v1/movecall-moji-esp32s3.jpg" target="_blank" title="Moji">
-<img src="docs/v1/movecall-moji-esp32s3.jpg" width="240" />
+<a href="docs/v1/xmini-c3.jpg" target="_blank" title="XiaGe Mini C3">
+<img src="docs/v1/xmini-c3.jpg" width="240" />
 </a>
 <a href="docs/v1/movecall-cuican-esp32s3.jpg" target="_blank" title="CuiCan">
 <img src="docs/v1/movecall-cuican-esp32s3.jpg" width="240" />
@@ -104,41 +98,53 @@ Breadboard demonstration:
 <a href="docs/v1/sensecap_watcher.jpg" target="_blank" title="SenseCAP Watcher">
 <img src="docs/v1/sensecap_watcher.jpg" width="240" />
 </a>
+<a href="docs/v1/esp-hi.jpg" target="_blank" title="ESP-HI Low Cost Robot Dog">
+<img src="docs/v1/esp-hi.jpg" width="240" />
+</a>
 </div>
 
-## Firmware Section
+## Software
 
-### Flashing Without Development Environment
+### Firmware Flashing
 
-For beginners, it's recommended to first use the firmware that can be flashed without setting up a development environment.
+For beginners, it is recommended to use the firmware that can be flashed without setting up a development environment.
 
-The firmware connects to the official [xiaozhi.me](https://xiaozhi.me) server by default. Currently, personal users can register an account to use the Qwen real-time model for free.
+The firmware connects to the official [xiaozhi.me](https://xiaozhi.me) server by default. Personal users can register an account to use the Qwen real-time model for free.
 
-👉 [Flash Firmware Guide (No IDF Environment)](https://ccnphfhqs21z.feishu.cn/wiki/Zpz4wXBtdimBrLk25WdcXzxcnNS)
+👉 [Beginner's Firmware Flashing Guide](https://ccnphfhqs21z.feishu.cn/wiki/Zpz4wXBtdimBrLk25WdcXzxcnNS)
 
 ### Development Environment
 
 - Cursor or VSCode
-- Install ESP-IDF plugin, select SDK version 5.3 or above
-- Linux is preferred over Windows for faster compilation and fewer driver issues
-- Use Google C++ code style, ensure compliance when submitting code
+- Install ESP-IDF plugin, select SDK version 5.4 or above
+- Linux is better than Windows for faster compilation and fewer driver issues
+- This project uses Google C++ code style, please ensure compliance when submitting code
 
 ### Developer Documentation
 
-- [Board Customization Guide](main/boards/README.md) - Learn how to create custom board adaptations for XiaoZhi
-- [IoT Control Module](main/iot/README.md) - Understand how to control IoT devices through AI voice commands
+- [Custom Board Guide](main/boards/README.md) - Learn how to create custom boards for XiaoZhi AI
+- [MCP Protocol IoT Control Usage](docs/mcp-usage.md) - Learn how to control IoT devices via MCP protocol
+- [MCP Protocol Interaction Flow](docs/mcp-protocol.md) - Device-side MCP protocol implementation
+- [A detailed WebSocket communication protocol document](docs/websocket.md)
+
+## Large Model Configuration
+
+If you already have a XiaoZhi AI chatbot device and have connected to the official server, you can log in to the [xiaozhi.me](https://xiaozhi.me) console for configuration.
 
-## AI Agent Configuration
+👉 [Backend Operation Video Tutorial (Old Interface)](https://www.bilibili.com/video/BV1jUCUY2EKM/)
 
-If you already have a XiaoZhi AI chatbot device, you can configure it through the [xiaozhi.me](https://xiaozhi.me) console.
+## Related Open Source Projects
 
-👉 [Backend Operation Tutorial (Old Interface)](https://www.bilibili.com/video/BV1jUCUY2EKM/)
+For server deployment on personal computers, refer to the following open-source projects:
 
-## Technical Principles and Private Deployment
+- [xinnan-tech/xiaozhi-esp32-server](https://github.com/xinnan-tech/xiaozhi-esp32-server) Python server
+- [joey-zhou/xiaozhi-esp32-server-java](https://github.com/joey-zhou/xiaozhi-esp32-server-java) Java server
+- [AnimeAIChat/xiaozhi-server-go](https://github.com/AnimeAIChat/xiaozhi-server-go) Golang server
 
-👉 [Detailed WebSocket Communication Protocol Documentation](docs/websocket.md)
+Other client projects using the XiaoZhi communication protocol:
 
-For server deployment on personal computers, refer to another MIT-licensed project [xiaozhi-esp32-server](https://github.com/xinnan-tech/xiaozhi-esp32-server)
+- [huangjunsen0406/py-xiaozhi](https://github.com/huangjunsen0406/py-xiaozhi) Python client
+- [TOM88812/xiaozhi-android-client](https://github.com/TOM88812/xiaozhi-android-client) Android client
 
 ## Star History
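
The updated feature list in README_en.md mentions device-side MCP for controlling the speaker, LED, servo, and GPIO. As a rough illustration of what such a request can look like, the sketch below builds a JSON-RPC 2.0 `tools/call` message, which is the general shape the Model Context Protocol uses for tool invocation. The tool name `set_volume`, its `volume` argument, and the helper `build_tool_call` are hypothetical and chosen only for illustration; the project's actual tool names and transport envelope are described in docs/mcp-protocol.md, which is not shown in this diff.

```python
import json
from itertools import count

# Monotonic request ids; JSON-RPC 2.0 uses the id to pair responses with requests.
_next_id = count(1)


def build_tool_call(tool_name: str, arguments: dict) -> str:
    """Serialize an MCP-style `tools/call` request as a JSON-RPC 2.0 message."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_next_id),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(request)


# Hypothetical tool name and argument, for illustration only.
message = build_tool_call("set_volume", {"volume": 60})
print(message)
```

How the serialized message reaches the device (WebSocket or MQTT+UDP, per the feature list) is outside the scope of this sketch.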
