The OpenClaw Problem, or: How I Stopped Wrapping Docker in VMs and Built a One-Command Setup Instead

#ai #macos #devops #opensource

Eugene Kovshilovsky

A continuation of 128GB of RAM, Zero Internet, and a Year of Building AI Infrastructure Nobody Asked For

When I shipped headless agent mode for Cloister, I thought I'd solved the hard problem. VM isolation, consent policies, credential forwarding — all the security layers you'd want between an autonomous AI agent and your actual life. The next step was obvious: run OpenClaw inside it.

OpenClaw is an open-source AI agent framework: a persistent assistant that connects to Telegram, WhatsApp, and iMessage and runs tasks on your behalf. It has a gateway (the brain), nodes (execution endpoints), and a sandbox for running arbitrary code. It's powerful. Its community has also been refreshingly honest about one thing: don't run it on bare metal.

So I didn't. I ran it inside Docker. Inside a Cloister Linux VM. On macOS.

If that sentence made you wince, you're ahead of where I was.

Three Security Bypasses and a Caddy Proxy

The first sign something was architecturally wrong came when I tried to access OpenClaw's Control UI through an SSH tunnel. Three things broke, each requiring a security bypass:

CSP Header. OpenClaw's CSP hashes didn't match its actual inline scripts. The UI rendered as a blank page. Fix: a Caddy reverse proxy to strip and replace the header. Not ideal.
Device Auth. Docker NAT meant the gateway saw 172.18.x.x instead of 127.0.0.1, triggering device pairing over HTTPS — which SSH tunnels don't provide. Fix: dangerouslyDisableDeviceAuth: true. The flag's name judged me every time I opened the config.
Gateway Bind. Loopback binding + Docker port publishing don't mix. Fix: gateway.bind: lan, which is exactly what the security docs tell you not to do.

I rationalized all three: "The VM has three isolation layers. Even with bypasses, the blast radius is contained." Technically true. Also a Rube Goldberg machine where each layer existed to compensate for the layer below it.

Then I found OpenClaw's Lume documentation.

The "Wait, They Already Solved This" Moment

OpenClaw has a docs page titled "macOS VM." I'd skimmed past it. When I actually read it:

Lume runs macOS VMs on Apple's Virtualization.framework — same hypervisor tech Cloister uses through Colima for Linux, but with macOS guests. The key differences that made me rethink everything:

No Docker. OpenClaw installs natively. It has its own sandbox — the one I was redundantly wrapping in another Docker layer.
No CSP bypass. Gateway runs on actual localhost. No NAT, no proxy.
No auth bypass. Connections come from real localhost. Device pairing just works.
No bind hack. Gateway stays on loopback. Everything works as documented.
iMessage access. Lume VMs are real macOS — they can run BlueBubbles for iMessage bridging. Linux VMs can't. Ever.

Five problems solved by removing a layer of abstraction. I chose the option that meant admitting the layer I'd built was wrong: abstract the VM backend so Cloister supports both. Colima for Linux workloads, Lume for macOS. The user says --openclaw, Cloister picks Lume. The user creates a regular profile, Cloister picks Colima. Nobody needs to know which hypervisor is underneath.

The Installation Gauntlet

If you've tried setting up a Lume macOS VM yourself, you've probably already rage-quit once. Here's what actually happens:

The IPSW download. Lume creates a macOS VM from Apple's ~18GB restore image. It downloads to a temp directory. If the setup fails — and it will — the download is gone. Next attempt? Another 35 minutes on a decent connection. I built IPSW caching into Cloister so it downloads once, keeps it at ~/.cloister/cache/ipsw/, and reuses it across every attempt and every profile. The kind of feature nobody thanks you for until they don't have it.
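A minimal sketch of the caching idea, not Cloister's actual code. The cache path comes from the paragraph above. The `fetch` and `cached_ipsw` function names, the `.partial` suffix, and the choice of `curl` are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch: download a restore image once and reuse it on later attempts.
# CACHE_DIR mirrors the path named in the article; the rest is hypothetical.
CACHE_DIR="${CACHE_DIR:-$HOME/.cloister/cache/ipsw}"

# fetch URL DEST: hypothetical downloader. --continue-at - resumes a partial file.
fetch() {
  curl --fail --location --continue-at - --output "$2" "$1"
}

# cached_ipsw URL: print the path of the cached image, downloading only on a miss.
cached_ipsw() {
  local url="$1"
  local dest="$CACHE_DIR/${url##*/}"   # cache key = file name taken from the URL
  local partial="$dest.partial"

  mkdir -p "$CACHE_DIR" || return 1
  if [ ! -f "$dest" ]; then
    # Download under a temporary name so an interrupted transfer is never
    # mistaken for a complete image; rename only after fetch succeeds.
    fetch "$url" "$partial" >&2 || return 1
    mv "$partial" "$dest" || return 1
  fi
  printf '%s\n' "$dest"
}

# Usage (the URL is a placeholder):
#   ipsw="$(cached_ipsw "https://example.invalid/UniversalMac_15.2_24C101_Restore.ipsw")"
```

Because the final rename happens only after a successful download, a failed VM install never invalidates the cached image, and a dropped connection leaves a `.partial` file that the next attempt can resume.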

The version mismatch. Apple's Virtualization.framework refuses to install a macOS guest newer than the host. It downloads the 18GB IPSW, starts the install, and rejects it after 25 minutes. I added a preflight check that parses the IPSW filename version, compares against sw_vers, and fails in five seconds instead of twenty-five minutes.
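Here is roughly what such a preflight check could look like in bash. It assumes the restore image follows Apple's usual `UniversalMac_<version>_<build>_Restore.ipsw` naming. The function names are hypothetical and this is not Cloister's implementation.

```shell
#!/usr/bin/env bash
# ipsw_version FILE: extract "15.2" from a name like UniversalMac_15.2_24C101_Restore.ipsw
ipsw_version() {
  local name="${1##*/}"
  local re='_([0-9]+(\.[0-9]+)*)_'
  if [[ "$name" =~ $re ]]; then
    printf '%s\n' "${BASH_REMATCH[1]}"
  else
    return 1
  fi
}

# version_le A B: succeed if dotted version A <= B, comparing numerically per component.
version_le() {
  local -a a b
  local i x y
  IFS=. read -r -a a <<< "$1"
  IFS=. read -r -a b <<< "$2"
  for ((i = 0; i < ${#a[@]} || i < ${#b[@]}; i++)); do
    x="${a[i]:-0}"   # missing components count as 0, so 15 == 15.0
    y="${b[i]:-0}"
    if ((10#$x < 10#$y)); then return 0; fi
    if ((10#$x > 10#$y)); then return 1; fi
  done
  return 0
}

# preflight IPSW_PATH HOST_VERSION: fail fast if the guest is newer than the host.
preflight() {
  local guest host="$2"
  guest="$(ipsw_version "$1")" || { echo "cannot parse version from $1" >&2; return 2; }
  if ! version_le "$guest" "$host"; then
    echo "guest macOS $guest is newer than host macOS $host" >&2
    return 1
  fi
}

# On a Mac:
#   preflight "$ipsw" "$(sw_vers -productVersion)" || exit 1
```

Comparing component by component matters here: a plain string comparison would rank 15.10 below 15.2.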

The Setup Assistant. Lume automates the macOS Setup Assistant by clicking through 167 UI elements via VNC. On macOS 26.4, Apple renamed a button. Step 66 of 167 fails. Every time. This is an upstream Lume issue — if you're doing it yourself, you'll need to create the VM without --unattended and click through setup manually via VNC. Five minutes by hand. Not ideal, but you only do it once per base image.

The base image. Cloister avoids repeating the steps above with a shared base image that every OpenClaw profile clones from. First profile: 15-20 minutes. Every subsequent one: about 2 minutes. Snapshots help too: a factory snapshot (taken after provisioning) and a user snapshot (your fully configured state) mean cloister reset gets you back to a working VM in under two minutes.

If you're setting this up without Cloister, budget an afternoon and a decent internet connection. Or save yourself the trouble.

The Real Problem: Nobody Knows Where to Start

The Lume setup gets you a blank macOS VM. Congratulations. Now you need to:

Configure passwordless sudo
Install Xcode Command Line Tools, Homebrew, Node.js, Playwright, OpenClaw
Run the OpenClaw daemon onboard with the right flags
Set up a Telegram bot via BotFather
Figure out your Telegram user ID (hint: message @userinfobot)
Configure Ollama as a provider — but first figure out the VM's bridge gateway IP
Set up Google OAuth for Gmail, Calendar, Drive — but the OAuth callback goes to localhost, and you're inside a VM, so you need an SSH tunnel
Deal with the macOS keychain password that drifted from the user password during headless setup
Register the node host with OpenClaw's device pairing system
Approve the device — but not with --all because that flag doesn't exist, it's --latest

Each of these has at least one gotcha that'll cost you 30 minutes of Googling — or 30 seconds of asking ChatGPT, which will confidently hallucinate the wrong config field name and cost you an hour. Some have gotchas that'll cost you a day.

I know because I hit every single one of them.

`cloister setup openclaw`

So I built a wizard.

```shell
cloister create --openclaw my-oc    # Create the Lume VM (clones from base image)
cloister setup openclaw my-oc       # One command. Handles everything.
```

The wizard runs five sections in order. Each one writes config and credentials immediately — never batches at the end. If your WiFi drops halfway through, re-running the wizard picks up from the last completed step.
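The "write immediately, resume on re-run" behavior can be sketched with one marker file per completed step. This is an illustration only: the state directory, the `run_step` helper, and the step function names are all hypothetical, and Cloister may track state quite differently.

```shell
#!/usr/bin/env bash
# Hypothetical location for per-profile wizard state.
STATE_DIR="${STATE_DIR:-$HOME/.cloister/state/my-oc}"

# run_step NAME CMD...: run CMD unless NAME already completed; record success at once.
run_step() {
  local name="$1"; shift
  local marker="$STATE_DIR/$name.done"
  mkdir -p "$STATE_DIR" || return 1
  if [ -e "$marker" ]; then
    echo "skip  $name (already done)"
    return 0
  fi
  echo "run   $name"
  "$@" || { echo "fail  $name" >&2; return 1; }
  : > "$marker"    # written right after the step succeeds, never batched at the end
}

# Usage (setup_* are placeholder functions, one per wizard section):
#   run_step credentials setup_credentials &&
#   run_step channels    setup_channels    &&
#   run_step providers   setup_providers
```

If the second step fails, the first marker is already on disk, so the next run skips straight to the step that broke.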

Credentials. Detects 1Password CLI. If found, stores everything there with Touch ID. If not, asks if you want to install it or use local encrypted storage. Generates a new keychain password (because the headless setup almost certainly left it in a state where the old one doesn't work), stores it before applying it to the VM (credential sync invariant — never change a password you haven't recorded), and stores the VM user credentials.
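The credential sync invariant is easy to state in code: store first, apply second, and stop if the store fails. The sketch below assumes two caller-supplied commands, one that records the password and one that applies it to the VM. Both are placeholders, as are the function names.

```shell
#!/usr/bin/env bash
# new_password: 24 hex characters (96 bits) read from the kernel RNG.
new_password() {
  od -An -N12 -tx1 /dev/urandom | tr -d ' \n'
}

# rotate_password STORE_CMD APPLY_CMD: record the new password, then apply it.
# STORE_CMD and APPLY_CMD are hypothetical commands that take the password as $1.
rotate_password() {
  local store="$1" apply="$2" pw
  pw="$(new_password)"
  [ "${#pw}" -eq 24 ] || return 1
  if ! "$store" "$pw"; then
    echo "could not record new password; nothing was changed" >&2
    return 1
  fi
  "$apply" "$pw"
}

# Usage (both names are placeholders):
#   rotate_password save_to_vault set_vm_keychain_password
```

A real store would also want to keep the previous entry until the apply step succeeds, so that a failed apply does not leave the vault ahead of the VM.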

Channels. Walks you through BotFather for Telegram — or accepts an existing bot token. Collects your user ID with instructions to message @userinfobot. Writes the config with dmPolicy: allowlist and allowFrom locked to your Telegram ID. I initially used allowedUserIds because that's what seemed logical. OpenClaw rejected it. The actual field is allowFrom. The kind of thing where you stare at a "Config invalid" error for twenty minutes before realizing the docs and the schema disagree.

WhatsApp gets configured as an action-only channel — OpenClaw can send you notifications, but nobody can issue commands through it. Safe defaults: allowCommands: false, allowFrom locked to your number, escalationPolicy: deny. Because the last thing you want is someone in a WhatsApp group accidentally telling your AI agent to rm -rf /.

Providers. Auto-detects Ollama on the VM's bridge gateway IP. Lists available models. Offers to pull qwen3:32b if nothing's loaded. Then asks the real question: local inference (free, fan sounds like a jet engine) or Anthropic Claude (costs money, quiet). Both can be registered simultaneously — you just pick a default. The wizard handles the API key collection and 1Password storage.
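For anyone doing the Ollama detection by hand, a rough equivalent looks like this. It assumes the VM's default route points at the host bridge, that Ollama on the host listens on an address the VM can reach, and that it uses its default port 11434. The helper names are made up, and the grep-based JSON handling is a shortcut rather than a proper parser.

```shell
#!/usr/bin/env bash
# gateway_ip: read `route -n get default` output on stdin, print the gateway address.
gateway_ip() {
  awk '$1 == "gateway:" { print $2; exit }'
}

# model_names: read an Ollama /api/tags JSON response on stdin, one model name per line.
model_names() {
  grep -oE '"name":[[:space:]]*"[^"]*"' | sed -E 's/.*"([^"]*)"$/\1/'
}

# Inside the macOS VM:
#   gw="$(route -n get default | gateway_ip)"
#   curl --silent --fail "http://$gw:11434/api/tags" | model_names
```

If the `curl` call fails, the usual culprit is Ollama binding only to the host's loopback address, which the VM cannot reach.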

Google OAuth. This one was fun. gog (the Google CLI) needs to run inside the VM, but the OAuth callback redirects to localhost — which is the VM's localhost, not your Mac's. The wizard picks a random high port (49152–60999, collision-checked), sets up an SSH tunnel, runs gog auth add with --listen-addr 0.0.0.0:<port>, and the callback routes through the tunnel back to gog. You just paste the URL in your browser and click Allow.

Except gog's file-backed keyring needs a password to encrypt tokens, and there's no TTY available for the prompt. The wizard sets GOG_KEYRING_PASSWORD to the keychain password it already stored. Another 45 minutes of debugging, now a single environment variable in the setup flow.
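The port selection and tunnel can be approximated in a few lines of bash. The port range is the one quoted above. The connect-based collision check, the retry limit, and the function names are assumptions, and the check only sees listeners on 127.0.0.1, so treat it as a sketch rather than a guarantee.

```shell
#!/usr/bin/env bash
PORT_MIN=49152
PORT_MAX=60999

# port_in_use PORT: succeed if something accepts connections on 127.0.0.1:PORT.
# Uses bash's /dev/tcp pseudo-device; the subshell closes the socket on exit.
port_in_use() {
  (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null
}

# pick_port: print a random free port in [PORT_MIN, PORT_MAX], or fail after 50 tries.
pick_port() {
  local span=$((PORT_MAX - PORT_MIN + 1)) port i
  for ((i = 0; i < 50; i++)); do
    port=$((PORT_MIN + RANDOM % span))   # slight modulo bias, fine for this purpose
    if ! port_in_use "$port"; then
      printf '%s\n' "$port"
      return 0
    fi
  done
  return 1
}

# Usage (user@vm-address is a placeholder):
#   port="$(pick_port)" || exit 1
#   ssh -N -L "$port:127.0.0.1:$port" user@vm-address &
#   # The Mac's localhost:$port now reaches localhost:$port inside the VM,
#   # where the OAuth callback listener is waiting.
```

There is still a small window between the check and the moment the tunnel binds the port. A retry on bind failure would close that gap.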

Device Pairing. Registers the node host, writes trusted proxy config, approves pending devices, and runs openclaw gateway probe to verify the whole thing is actually healthy — WebSocket connectivity, RPC health, auth warnings. Not just "is the process running" but "can we actually talk to it."

The whole flow takes about 3 minutes on a running VM. Non-interactive mode works too:

```shell
cloister setup openclaw my-oc \
  --telegram-token="BOT_TOKEN" \
  --telegram-user-id="USER_ID" \
  --default-provider=ollama \
  --ollama-model=qwen3:32b \
  --google-client-secret=~/client_secret.json \
  --google-email="you@gmail.com"
```

And for AI agents that want to discover what's configurable:

```shell
cloister setup openclaw my-oc --list-options --json
```

The Config Serialization Bug That Almost Ate Everything

While building the wizard, I discovered that older Cloister binaries (such as the one installed via Homebrew) would silently drop the backend: lume field when saving config. Go's yaml.v3 library ignores YAML keys that have no matching struct field during unmarshal, so the old binary would load the config, discard the backend key it didn't know about, and write the file back without it. The next time the new binary loaded the config, it would default to Colima and everything would break.

The fix was three things: remove omitempty from the Backend field (always serialize it), add a config version number (so we can detect and migrate), and rotate config.yaml to config.yaml.prev before every save. If anything goes wrong, your previous config is one file rename away. Simple, boring, correct.
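The rotation part of that fix is simple enough to show in shell. This is a sketch of the idea rather than Cloister's Go code. The `save_config` name and the temp-file suffix are hypothetical, and the config path in the usage comment is a guess.

```shell
#!/usr/bin/env bash
# save_config PATH: write stdin to PATH, keeping the previous file as PATH.prev.
save_config() {
  local path="$1"
  local tmp="$path.tmp.$$"
  # Write the new content to a temp file first so a failed write
  # never truncates the live config.
  cat > "$tmp" || return 1
  if [ -f "$path" ]; then
    cp -p "$path" "$path.prev" || { rm -f "$tmp"; return 1; }
  fi
  mv "$tmp" "$path"   # rename within one directory, so readers never see a half-written file
}

# Usage (the path is an assumption):
#   generate_config | save_config "$HOME/.cloister/config.yaml"
# Recovery is one rename:
#   mv "$HOME/.cloister/config.yaml.prev" "$HOME/.cloister/config.yaml"
```

Rotation would not have prevented the dropped field, but it would have made the damage recoverable on the first bad save.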

The Agent Command That Had to Die

The original Cloister had a whole agent subcommand tree — agent start, agent stop, agent status, agent logs, agent forward. This made sense when OpenClaw ran inside Docker inside a Linux VM. The VM was one thing, the Docker container was another, and you needed separate controls for each.

With Lume, the VM is the agent. OpenClaw runs as a launchd service, not a Docker container. There's no separate lifecycle to manage. So I replaced the entire agent subcommand tree with unified profile commands: cloister status shows everything (both backends, VM state, gateway health), cloister logs tails the gateway log, cloister stop stops the VM. One interface, regardless of what's running underneath.

The old cloister agent command now prints a polite deprecation message pointing to the new commands. Because breaking people's muscle memory without a breadcrumb trail is rude.

What I Learned (Again)

When I hit the Docker networking problems, my instinct was to add complexity. A proxy for the CSP. A flag to disable auth. A bind override. Each solution was individually defensible. Together, they formed a tower of compensating controls that existed only because the foundational decision — running a macOS-native application inside Docker inside Linux inside macOS — was wrong.

The Lume approach doesn't need any of those fixes because it doesn't create any of those problems.

Removing abstraction layers is harder than adding them. You have to admit the layer you built was wrong. You have to resist the voice that says "but it works fine with the three workarounds." It does work fine. But when you're building infrastructure that autonomous AI agents run on, "fine" is measured in security bypasses you hope nobody exploits.

The setup wizard is the part I'm most satisfied with. Not because the code is clever — it's mostly prompts, SSH commands, and state tracking — but because it encodes every gotcha, every undocumented field name, every TTY password issue, every keychain drift problem into a flow that takes 3 minutes instead of a day. The next person who wants to run OpenClaw in an isolated macOS VM doesn't need to learn any of this. They just run one command.

That's what infrastructure is supposed to do: absorb complexity so the user doesn't have to.

ekovshilovsky / cloister

Run multiple Claude Code accounts & AI agents on one Mac — isolated macOS VMs with secure sandboxing

Cloister: Isolated VMs for AI Agents & Multi-Account Claude Code on macOS

Isolated macOS VM environments for running multiple Claude Code organizations, sandboxing autonomous AI agents like OpenClaw, and separating credentials across client engagements.

Why cloister?

Multi-account isolation. Claude Code stores credentials, conversation history, and project config in ~/.claude. If you work across multiple organizations, every session shares the same identity. cloister gives each account its own isolated VM with separate credentials, CLAUDE.md, and conversation history.

Autonomous agent containment. AI agents like OpenClaw run 24/7 with shell access, browser control, and cron scheduling. cloister's VM isolation is stronger than Docker (separate kernel, not just namespace isolation) — services inside the VM are unreachable from the host unless explicitly tunneled, and cloister stop is an instant kill switch.

Dual-backend architecture. Colima (Linux VMs) for Claude Code isolation and Docker workloads. Lume (macOS VMs via Apple Virtualization Framework) for OpenClaw…

cloister — Isolated VM environments for AI coding agents and multi-account separation. Dual-backend: Colima (Linux) and Lume (macOS). One-command OpenClaw setup.

op-forward — Forward 1Password CLI across SSH boundaries with biometric authentication.