Not just another chatbot. A lightweight personal software platform powered by AI — controlling your PC, phone, and smart glasses from one hub.
"This is not an LLM. It's software that utilizes the LLM."
AI Butler is a personal software platform for Windows desktops that uses artificial intelligence as its engine — not its product. While competitors build chatbots, AI Butler builds a software ecosystem where AI is one of many tools that help you get things done.
The platform is free at its core: local AI (Ollama), voice control (Edge TTS + Whisper), smart browser, and task management cost nothing. For users who want cloud-grade AI (Alibaba Qwen), the platform offers pay-per-use access at twice the underlying DashScope cost, with 30% of each payment returned to users as referral rewards.
What makes AI Butler unique is its multi-device architecture: the PC is the brain, while phones (WeChat Mini Program) and smart glasses (Alibaba S1 via MCP) serve as input/output endpoints. One AI, one account, all your devices.
Today's AI assistants force users to choose: powerful but expensive (ChatGPT Plus at $20/mo), or free but limited (basic tools without intelligence). Furthermore, every AI assistant is essentially a chatbot — it answers questions but cannot take real actions on your behalf.
Current AI products are language models wrapped in chat interfaces. You ask, they answer. But real productivity requires action: creating calendar events, comparing prices, generating images, controlling devices. A chatbot that can't execute is just a fancy search engine.
Your phone has Siri, your PC has Cortana, your browser has ChatGPT. None of them talk to each other. Your AI context is fragmented across devices. There is no unified personal AI that follows you from desktop to phone to glasses.
ChatGPT Plus costs $20/month (~¥145). Claude Pro costs $20/month. Kimi Pro costs ¥59/month. For the average Chinese user, spending ¥100+ per month on AI is not justified for occasional use. The industry needs a ¥1 tier.
"The foundation is free, but the extensibility is incredible."
AI Butler is a desktop application that happens to use AI, not an AI that happens to have a desktop interface. The distinction matters:
| Chatbot Approach | Software Approach (AI Butler) |
|---|---|
| User asks, AI answers | User states intent, system executes |
| Text in, text out | Text/voice/image in, actions out |
| Single device (web/app) | Multi-device hub (PC + Phone + Glasses) |
| Cloud-dependent | Local-first, cloud-enhanced |
| $20/month minimum | Free base, ¥1/month cloud |
The AI engine (DashScope Qwen) is like the engine in a car — essential but invisible. What the user sees is a personal productivity platform with a character-driven UI, voice interaction, smart browsing, task management, media creation, and device control.
Each layer is independent and extensible. Layer 0-1 provide the foundation. Layer 2 powers intelligence. Layer 3 enables action. Layer 4 adds capabilities. Layer 5 connects devices.
The multi-agent architecture transforms AI Butler from a chatbot into software. Instead of just generating text, the AI can call functions — creating todos, generating images, comparing prices, opening URLs, and controlling devices.
In v4.0, a single Router Agent receives all user input and dispatches it to the appropriate tools via DashScope's function calling API. Once the tool count exceeds 20, specialized agents (Secretary, Analyst, Creator, Controller) are activated.
| Agent | Tools | Examples |
|---|---|---|
| Secretary | create_todo, set_reminder, search_calendar, send_notification | "Add grocery shopping to my todo list" |
| Analyst | web_search, analyze_image, summarize_document, price_compare | "Compare AirPods prices on Taobao" |
| Creator | generate_image, generate_video, text_to_speech, clone_voice | "Create a poster for my shop" |
| Controller | open_url, run_skill, control_device, capture_screen | "Open Taobao and search for headphones" |
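The tool-calling idea above can be sketched as a small client-side registry. This is a minimal illustration, not the project's actual code: `TOOL_REGISTRY`, `register_tool`, `tool_schemas`, and `execute_tool_call` are hypothetical names, the body of `create_todo` is a placeholder, and the schema uses the JSON-schema "function" shape common to function-calling APIs (that DashScope expects exactly this shape is an assumption).

```python
import json
from typing import Any, Callable

# Hypothetical in-memory registry: tool name -> (schema, handler).
TOOL_REGISTRY: dict[str, tuple[dict[str, Any], Callable[..., Any]]] = {}


def register_tool(name: str, description: str, parameters: dict[str, Any]):
    """Decorator that stores a function together with its JSON-schema description."""
    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        schema = {
            "type": "function",
            "function": {
                "name": name,
                "description": description,
                "parameters": parameters,
            },
        }
        TOOL_REGISTRY[name] = (schema, func)
        return func
    return decorator


@register_tool(
    name="create_todo",
    description="Add an item to the user's todo list.",
    parameters={
        "type": "object",
        "properties": {"title": {"type": "string"}},
        "required": ["title"],
    },
)
def create_todo(title: str) -> dict[str, Any]:
    # Placeholder: a real client would persist the todo somewhere.
    return {"status": "created", "title": title}


def tool_schemas() -> list[dict[str, Any]]:
    """Schemas to send alongside the chat request so the model can pick a tool."""
    return [schema for schema, _ in TOOL_REGISTRY.values()]


def execute_tool_call(name: str, arguments_json: str) -> Any:
    """Run the tool the model selected, on the client side."""
    if name not in TOOL_REGISTRY:
        raise KeyError(f"unknown tool: {name}")
    _, handler = TOOL_REGISTRY[name]
    return handler(**json.loads(arguments_json))
```

When the model replies with a tool name and a JSON argument string, the client would call `execute_tool_call("create_todo", '{"title": "grocery shopping"}')` and feed the result back into the conversation.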
v4.0: One Router Agent plus ~15 tools. The AI selects the right tool automatically.
v4.1: Specialized agents are activated when the tool count exceeds 20. Intent classification comes first, then delegation.
v4.2: Multi-agent collaboration. Secretary checks the calendar and sets reminders, Analyst researches, and Controller opens the relevant pages, all in one conversation.
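The switch from one Router Agent to specialized agents can be sketched as a threshold check followed by intent-based delegation. The tool lists and the threshold of 20 come from the text above; the intent labels (`schedule`, `research`, `create`, `control`) and the function name `select_agent` are assumptions made for illustration.

```python
# Tool lists copied from the agent table above.
AGENT_TOOLS = {
    "Secretary": ["create_todo", "set_reminder", "search_calendar", "send_notification"],
    "Analyst": ["web_search", "analyze_image", "summarize_document", "price_compare"],
    "Creator": ["generate_image", "generate_video", "text_to_speech", "clone_voice"],
    "Controller": ["open_url", "run_skill", "control_device", "capture_screen"],
}

# Specialized agents are activated once the tool count exceeds this value.
SPECIALIZATION_THRESHOLD = 20

# Hypothetical intent labels; the document does not define a classifier output format.
INTENT_TO_AGENT = {
    "schedule": "Secretary",
    "research": "Analyst",
    "create": "Creator",
    "control": "Controller",
}


def select_agent(intent: str, total_tools: int) -> str:
    """Return the agent that should handle a classified intent."""
    if total_tools <= SPECIALIZATION_THRESHOLD:
        # v4.0 behaviour: one Router Agent sees every tool.
        return "Router"
    # v4.1 behaviour: delegate by intent; unknown intents stay with the Router.
    return INTENT_TO_AGENT.get(intent, "Router")


total = sum(len(tools) for tools in AGENT_TOOLS.values())
print(total, select_agent("research", total), select_agent("research", 24))
```

With the 16 tools listed in the table, every request stays with the Router; delegation only starts once the registry grows past the threshold.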
"Control your glasses, phone, and computer — all through one butler."
All device communication uses a unified JSON protocol:
{
  "bdp_version": "1.0",
  "device_type": "pc | phone | glasses | web",
  "msg_type": "intent | response | notification | sync",
  "payload": { ... },
  "auth": { "token": "jwt" }
}
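A message in this envelope could be built and validated along the following lines. This is a sketch under stated assumptions: the class name `BDPMessage` and its methods are hypothetical, the payload contents are left open because the source elides them, and the token is treated as an opaque string (no JWT verification is shown).

```python
import json
from dataclasses import dataclass
from typing import Any

# Allowed values taken from the envelope definition above.
DEVICE_TYPES = {"pc", "phone", "glasses", "web"}
MSG_TYPES = {"intent", "response", "notification", "sync"}


@dataclass
class BDPMessage:
    device_type: str
    msg_type: str
    payload: dict[str, Any]
    token: str  # opaque JWT string; verification is out of scope here
    bdp_version: str = "1.0"

    def __post_init__(self) -> None:
        # Reject values outside the enumerations in the protocol definition.
        if self.device_type not in DEVICE_TYPES:
            raise ValueError(f"unknown device_type: {self.device_type!r}")
        if self.msg_type not in MSG_TYPES:
            raise ValueError(f"unknown msg_type: {self.msg_type!r}")

    def to_json(self) -> str:
        """Serialize to the wire format, nesting the token under "auth"."""
        return json.dumps({
            "bdp_version": self.bdp_version,
            "device_type": self.device_type,
            "msg_type": self.msg_type,
            "payload": self.payload,
            "auth": {"token": self.token},
        })

    @classmethod
    def from_json(cls, raw: str) -> "BDPMessage":
        """Parse and validate a received message."""
        data = json.loads(raw)
        return cls(
            device_type=data["device_type"],
            msg_type=data["msg_type"],
            payload=data["payload"],
            token=data["auth"]["token"],
            bdp_version=data["bdp_version"],
        )


# The "text" payload key is an illustrative assumption.
msg = BDPMessage("phone", "intent", {"text": "Add milk to my todo list"}, token="jwt")
restored = BDPMessage.from_json(msg.to_json())
```

Validating on construction means a malformed `device_type` or `msg_type` fails at the edge of the system rather than inside the message router.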
| Channel | Protocol | Target |
|---|---|---|
| LOCAL | PyQt5 Signals | PC (native widget) |
| WEBSOCKET | WSS | Phone / Web client |
| MCP | BLE + Wi-Fi Direct | S1 Smart Glasses |
| HTTP | HTTPS REST | Generic IoT / 3rd party |
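The channel table maps naturally onto a lookup from device type to transport. The sketch below only encodes that mapping; `Channel`, `DEVICE_CHANNEL`, and `channel_for` are hypothetical names, and falling back to HTTP for any device type not listed is an assumption based on the "Generic IoT / 3rd party" row.

```python
from enum import Enum


class Channel(Enum):
    # Values are the protocols listed in the channel table.
    LOCAL = "PyQt5 Signals"
    WEBSOCKET = "WSS"
    MCP = "BLE + Wi-Fi Direct"
    HTTP = "HTTPS REST"


# Device types are the ones allowed in the BDP "device_type" field.
DEVICE_CHANNEL = {
    "pc": Channel.LOCAL,
    "phone": Channel.WEBSOCKET,
    "web": Channel.WEBSOCKET,
    "glasses": Channel.MCP,
}


def channel_for(device_type: str) -> Channel:
    """Pick the transport for a device; unlisted devices use HTTPS REST (assumed)."""
    return DEVICE_CHANNEL.get(device_type, Channel.HTTP)


print(channel_for("glasses").name, channel_for("smart_speaker").name)
```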
Instead of a native app, AI Butler connects to phones via a WeChat Mini Program — zero installation for 1.3 billion WeChat users. Chat sync, todo sync, file transfer, voice commands, and referral sharing are all built in.
The PC is the brain. S1 glasses are sensors and display endpoints. Camera frames flow to DashScope Vision AI via the PC. Microphone audio goes through local STT. Responses return as TTS audio and display overlays. The glasses use the same DashScope API key as the PC, so adding them requires no separate account or subscription.
DashScope (百炼) by Alibaba provides 6 AI services under a single API key:
| Service | Model | Cost (Platform) | User Price (2x) |
|---|---|---|---|
| AI Chat (Deep Thinking) | Qwen3-Max | ¥2.5~10/1M tok | ¥5~20/1M tok |
| AI Chat (Economy) | Qwen-Turbo | ¥0.3~1.2/1M tok | ¥0.6~2.4/1M tok |
| Image Generation | Wan2.6-T2I | ¥0.2/image | ¥0.4/image |
| Video Generation (720p) | Wan2.6-T2V | ¥0.6/sec | ¥1.2/sec |
| Text-to-Speech | Qwen3-TTS | ¥0.8/10K chars | ¥1.6/10K chars |
| Vision Analysis | Qwen-VL-Plus | ¥2~5/1M tok | ¥4~10/1M tok |
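The "User Price (2x)" column is the platform cost multiplied by two. As a worked example, the sketch below reproduces that column for the token-priced models and estimates what a given number of tokens would cost a user. The figures are copied from the table; the function names are hypothetical, and because the source gives each cost as a range, the result is a range too.

```python
from decimal import Decimal

MARKUP = Decimal("2")  # user price = 2x platform cost

# Platform cost in CNY per 1M tokens as (low, high), copied from the table.
TOKEN_COST_PER_MILLION = {
    "Qwen3-Max": (Decimal("2.5"), Decimal("10")),
    "Qwen-Turbo": (Decimal("0.3"), Decimal("1.2")),
    "Qwen-VL-Plus": (Decimal("2"), Decimal("5")),
}


def user_price_range(model: str) -> tuple[Decimal, Decimal]:
    """User price per 1M tokens, as a (low, high) range in CNY."""
    low, high = TOKEN_COST_PER_MILLION[model]
    return low * MARKUP, high * MARKUP


def usage_cost_range(model: str, tokens: int) -> tuple[Decimal, Decimal]:
    """What a user would pay for `tokens` tokens, as a (low, high) range in CNY."""
    low, high = user_price_range(model)
    scale = Decimal(tokens) / Decimal(1_000_000)
    return low * scale, high * scale


# 200,000 tokens on the economy model.
low, high = usage_cost_range("Qwen-Turbo", 200_000)
print(f"Qwen-Turbo, 200k tokens: ¥{low} to ¥{high}")
```

For 200,000 tokens on Qwen-Turbo this gives ¥0.12 to ¥0.48, which shows why a ¥10 top-up goes a long way on the economy model.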
| Stream | Type | Price | Margin |
|---|---|---|---|
| Subscription | Fixed monthly/yearly | ¥1/key/mo or ¥10/key/yr | ~100% |
| Booster | Pay-per-use AI | 2x DashScope cost | ~20% (after 30% referral rewards) |
| Products | One-time purchase | ¥5~20 | 60~100% |
Local AI (Ollama), Edge TTS, Whisper STT, basic character, todos, browser, 3 languages — all permanently free. The free tier is a fully functional local AI assistant that runs on the user's hardware at zero cost to the platform.
Every booster payment returns 30% to the ecosystem: 20% to the direct referrer, 10% to the second-level referrer. Earnings auto-reload into booster balance, creating a "free AI" experience for active referrers.
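The referral split and the booster margin can be checked with a little arithmetic. The sketch below works in fen (1 yuan = 100 fen) to avoid rounding drift. The 20% and 10% shares and the 2x markup come from the text; the function name, the rounding-down rule, and the assumption that both referrers exist are illustrative choices.

```python
DIRECT_SHARE_PERCENT = 20   # direct referrer
SECOND_SHARE_PERCENT = 10   # second-level referrer


def split_booster_payment(amount_fen: int, dashscope_cost_fen: int) -> dict[str, int]:
    """Break one booster payment into referral rewards, API cost, and platform margin.

    Amounts are in fen. Rewards are rounded down (an assumption); whatever is
    left after rewards and API cost stays with the platform.
    """
    direct = amount_fen * DIRECT_SHARE_PERCENT // 100
    second = amount_fen * SECOND_SHARE_PERCENT // 100
    platform = amount_fen - dashscope_cost_fen - direct - second
    return {
        "direct_referrer": direct,
        "second_level_referrer": second,
        "dashscope_cost": dashscope_cost_fen,
        "platform_margin": platform,
    }


# A ¥10 top-up fully spent at the 2x user price corresponds to ¥5 of DashScope cost.
breakdown = split_booster_payment(amount_fen=1000, dashscope_cost_fen=500)
margin_percent = breakdown["platform_margin"] * 100 // 1000
print(breakdown, f"margin: {margin_percent}%")
```

Of a ¥10 payment, ¥5 covers DashScope, ¥2 goes to the direct referrer, ¥1 to the second-level referrer, and ¥2 remains with the platform, which is the roughly 20% booster margin listed in the revenue table.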
Free local AI (hook) → Quality gap (try booster) → Top-up ¥10 → Referral rewards (share) → Auto-reload balance ("free AI" feeling) → Multi-device stickiness (can't leave) → More referrals → Viral growth
| Platform | Free Tier | Paid Tier | Our Advantage |
|---|---|---|---|
| ChatGPT Plus | Limited GPT-4o-mini | $20/mo (~¥145) | 1/145 price, desktop-first |
| Tongyi Qianwen | Daily limits | ~¥19/mo | Unlimited local AI + multi-device |
| DeepSeek | Free chat | ¥0.14/1M tok | Complete software (UI+voice+skills) |
| Doubao (ByteDance) | Daily limits | Subscription | Desktop + phone + glasses |
| Kimi (Moonshot) | Free long-context | ¥59/mo | Function calling + local AI fallback |
AI Butler is the only desktop-first AI platform that combines local AI, cloud AI, multi-agent function calling, and a 3-device ecosystem (PC + Phone + Glasses), with a ¥1/month subscription that is roughly 1/145th the price of ChatGPT Plus.
SSE streaming responses, balance top-up UI, 3-tier model selection, dark mode. Users feel "real AI".
DashScope tools parameter, tool registry, skill-to-tool conversion, client-side tool execution. AI takes actions.
Image/video generation UI, WeChat Pay integration, subscription status dashboard, referral link generation.
WebSocket endpoint, device registry, BDP message router, context sync engine, push notifications.
Phone connected. Chat sync, todo sync, file transfer, referral sharing — all via WeChat.
S1 MCP adapter, vision/audio pipeline, skill marketplace UI, developer SDK, ad integration.
| Component | Technology |
|---|---|
| Desktop Client | Python 3.11 + PyQt5 |
| Server Backend | FastAPI + PostgreSQL + Uvicorn |
| AI Engine (Cloud) | DashScope (Alibaba) — Qwen, Wan, TTS, VL |
| AI Engine (Local) | Ollama (llama3, qwen2, mistral) |
| Voice | Edge TTS (free) + DashScope TTS (premium) + Whisper STT |
| Authentication | WeChat OAuth 2.0 + JWT |
| Payment | Alipay RSA2 + WeChat Pay V3 |
| Phone Client | WeChat Mini Program (WXML/WXSS/JS) |
| Device Protocol | BDP v1.0 (WebSocket + MCP) |
| Installer | PyInstaller + InnoSetup |