Self-Hosted AI Companion: What You're Actually Building
A conceptual overview of running your own AI companion stack. What the pieces are, how they fit together, and what you're signing up for.
Self-Hosted AI Companion: What You're Actually Building
A conceptual overview. For installation, we link to official docs. For architecture, we explain how the pieces fit together.
The Honest Premise
Running your own AI companion stack is doable. It's also not plug-and-play. If you're comfortable with command lines, config files, and troubleshooting, self-hosting gives you full control. If that sounds like a weekend you'd rather spend with your doll, managed hosting will make more sense when it's ready.
This guide explains what you're building — not how to install each component from scratch. For that, we link to official documentation.
The Stack (High Level)
An AI companion needs three things to feel present:
- A brain — the LLM that generates responses
- A memory — persistent context across conversations
- A voice — text-to-speech that doesn't sound like a GPS
The stack that powers this:
| Component | What It Does | Official Docs | |-----------|-------------|---------------| | OpenClaw | The orchestration layer. Connects LLM, memory, voice, and sensors (if you add them). Handles message routing, session management, and the personality system. | [docs.openclaw.ai](https://docs.openclaw.ai) | | LLM Backend | The actual model. Ollama for local, or API calls to Claude/OpenAI for cloud. Runs on your GPU or CPU. | [ollama.com](https://ollama.com) | | Voice (TTS) | Converts text responses to speech. ElevenLabs for quality, Piper for local/offline, or other providers. | [elevenlabs.io](https://elevenlabs.io) / [github.com/rhasspy/piper](https://github.com/rhasspy/piper) | | Voice Input | Speech-to-text for talking to her. Whisper (local) or cloud STT APIs. | [github.com/openai/whisper](https://github.com/openai/whisper) | | Memory | Vector database or file-based storage for long-term context. Depends on OpenClaw's configured backend. | (Handled by OpenClaw) |
What the Architecture Looks Like
You (voice/text) -> Speech-to-Text (Whisper or cloud STT) -> OpenClaw (orchestrates everything) -> Memory lookup (what does she know about you?) -> LLM inference (Ollama local / Claude API / etc.) -> Response generated -> Text-to-Speech (ElevenLabs / Piper / etc.) -> You hear her voice
That's it. Every piece is swappable. Use local models for privacy, cloud models for capability. Use Piper for offline voice, ElevenLabs for quality. OpenClaw sits in the middle and routes everything.
What Self-Hosted Actually Means
You are the sysadmin.
- You install OpenClaw on your machine (Mac, Linux, Windows with WSL)
- You download and manage models (7B parameter models need ~4GB VRAM; 70B needs ~40GB+)
- You configure API keys for any cloud services you use
- You handle updates, breaking changes, and model compatibility
- You troubleshoot when the voice stops working or the model hallucinates
The trade-off: Complete control over your data, your personality config, your voice. No subscription. But your time is the cost.
Hardware Reality Check
| Use Case | Minimum | Recommended | |----------|---------|-------------| | Basic text chat (7B model) | 8GB RAM, any CPU | 16GB RAM, M1/M2 Mac or mid-range GPU | | Voice + text (7B model) | 16GB RAM, M1 Mac | 32GB RAM, dedicated GPU with 8GB+ VRAM | | Quality voice + larger model (13B+) | 32GB RAM, GPU with 12GB+ VRAM | 64GB RAM, RTX 4090 or M3 Max |
MacBooks work well — Apple Silicon handles 7B–13B models efficiently. For larger models or multiple concurrent users, you want NVIDIA GPUs.
The Managed Alternative (In Development)
We're building managed hosting for people who want the experience without the infrastructure work:
- We host the models (private inference — your data isn't used for training)
- We handle updates, compatibility, and scaling
- You get a web dashboard to configure personality, voice, and memory
- Monthly subscription, no hardware required
Current status: In development. No public timeline. [Join the waitlist](/ai-integration) to be notified.
Why We're Transparent About Self-Hosting
Most companies in this space hide the self-hosted option because they want you to subscribe. We're doing the opposite:
- Self-hosting is real. It works. We use it ourselves.
- It's not for everyone. The technical barrier is genuine.
- If you try self-hosting and decide it's too much work, managed hosting becomes an obvious value proposition — not because we tricked you, but because you experienced the alternative.
That's the honest pitch. No lock-in, no fiction.
Next Steps
- Read the OpenClaw docs — [docs.openclaw.ai](https://docs.openclaw.ai)
- Install Ollama — [ollama.com](https://ollama.com)
- Pick a voice provider — ElevenLabs for quality, Piper for local
- Configure OpenClaw to connect the pieces
- Iterate on personality — the system prompt is where the magic happens
Questions? The OpenClaw community and docs are your best resources. We're building this in public, same as you.