Remote OpenClaw Blog

OpenClaw iOS App: Canvas, Camera, and Voice Wake Setup Guide

7 min read · 1 March 2026

What the iOS App Does

The OpenClaw iOS app is a companion client for your OpenClaw agent. It is not a standalone agent that runs on your iPhone. Your agent continues to run on your server (VPS, dedicated server, or Mac). The iOS app provides a dedicated mobile interface for interacting with that remote agent.

Think of it this way: your OpenClaw agent is the brain, running on your server 24/7. The iOS app is one of several ways to talk to that brain, alongside WhatsApp, Telegram, email, and the web UI. What sets the iOS app apart is three features: Canvas, Camera, and Voice Wake. Canvas and Voice Wake are not available through messaging apps, and the camera integration goes beyond attaching a photo to a chat message.

The app is free on the App Store and works with any OpenClaw instance running version 3.22 or later. It communicates with your server over a secure WebSocket connection through the OpenClaw Gateway API.

Connecting to Your Gateway

To use the iOS app, you need to enable the Gateway API on your OpenClaw instance. Add these variables to your .env file:

# Enable the Gateway API for mobile connections
OPENCLAW_GATEWAY_ENABLED=true

# Gateway port (default: 3001, separate from the main web UI port)
OPENCLAW_GATEWAY_PORT=3001

# Authentication token (generate a strong random string)
OPENCLAW_GATEWAY_TOKEN=your-random-token-here

# Optional: restrict to specific IP ranges
# OPENCLAW_GATEWAY_ALLOWED_IPS=*
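
If you need a value for OPENCLAW_GATEWAY_TOKEN, one way to produce a strong random string is Python's standard secrets module. This is a minimal sketch: the function name is hypothetical, and it assumes the gateway accepts URL-safe characters (letters, digits, "-" and "_") in the token, which this guide does not confirm.

```python
import secrets


def generate_gateway_token(num_bytes: int = 32) -> str:
    """Return a URL-safe random token built from num_bytes of OS randomness."""
    # token_urlsafe draws from the operating system's secure random source.
    return secrets.token_urlsafe(num_bytes)


if __name__ == "__main__":
    # Paste the printed value into OPENCLAW_GATEWAY_TOKEN in your .env file.
    print(generate_gateway_token())
```

Thirty-two random bytes give a 43-character token, which is long enough that guessing it is impractical.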

Restart OpenClaw after adding these variables. Then, in the iOS app:

Open the app and tap "Connect to Server"
Enter your server address: wss://your-server-domain:3001 (or use the IP address)
Enter the gateway token you set in OPENCLAW_GATEWAY_TOKEN
Tap "Connect" — the app should show a green "Connected" status

Important: For security, always use WSS (WebSocket Secure) rather than plain WS. This requires an SSL certificate on your server. If you are using a reverse proxy like Nginx or Caddy, route the gateway port through your existing SSL setup. If you are connecting over a local network (same Wi-Fi), plain WS is acceptable for testing.
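
The rule above can be sketched as a small check you run on an address before typing it into the app. The function name is hypothetical and is not part of OpenClaw. It treats plain WS as acceptable only when the host is a literal IP address that Python's ipaddress module classifies as private (which also covers loopback and link-local ranges), and it rejects plain WS to a hostname because the name might resolve to a public address.

```python
import ipaddress
from urllib.parse import urlsplit


def gateway_url_is_acceptable(url: str) -> bool:
    """Accept any wss:// URL; accept ws:// only for private IP addresses."""
    parts = urlsplit(url)
    if parts.hostname is None:
        return False
    if parts.scheme == "wss":
        return True
    if parts.scheme != "ws":
        return False
    try:
        address = ipaddress.ip_address(parts.hostname)
    except ValueError:
        # Not a literal IP: refuse plain WS because the name may be public.
        return False
    return address.is_private
```

With this check, wss://your-server-domain:3001 and ws://192.168.1.20:3001 pass, while ws://your-server-domain:3001 is rejected.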

If the connection fails, common causes are:

Firewall blocking port 3001 (open it in your VPS firewall)
Wrong gateway token (copy-paste it exactly, no trailing spaces)
Gateway not enabled (check the logs for "Gateway listening on port 3001")
SSL misconfiguration (to isolate the issue, try plain WS first, but only on a trusted local network)
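
To tell a firewall problem apart from a token or SSL problem, you can first check whether the gateway port accepts TCP connections at all. The sketch below uses only Python's standard socket module. It does not speak the gateway protocol or send your token, so a successful result only shows that something is listening on the port. The function name is hypothetical.

```python
import socket


def port_is_reachable(host: str, port: int = 3001, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS lookup failures.
        return False
```

Run port_is_reachable("your-server-domain") from a machine outside your server's network. If it returns False, look at the firewall and at whether the gateway is running. If it returns True, the token or the SSL setup is the more likely cause.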

Canvas: The Visual Workspace

Canvas is the iOS app's standout feature. It provides a full-screen visual workspace where the agent can display rich content that goes beyond what messaging apps support.

What Canvas can display:

Charts and graphs: Revenue trends, pipeline funnels, budget comparisons — rendered as interactive charts you can zoom and scroll.
Tables: Structured data like CRM contacts, invoice lists, or task boards with sortable columns.
Formatted documents: Reports, proposals, and summaries with headings, bullet points, and proper formatting.
Images and media: Generated images, screenshots, or photos with annotations.
Interactive elements: Buttons for quick actions ("Approve this invoice," "Schedule this meeting," "Send this email").

How to use Canvas:

When you ask the agent for something that benefits from visual presentation, it automatically uses Canvas. For example:

"Show me this month's revenue by client" — renders a bar chart on Canvas
"Display my task board" — renders a Kanban-style table on Canvas
"Draft a proposal for Acme Corp" — displays a formatted document on Canvas with an "Email to client" button

You can also explicitly request Canvas output: "Put this on Canvas" or "Show this visually." The agent detects the intent and uses the Canvas API instead of sending a text response.

Canvas content persists in the app until you dismiss it or the agent sends new Canvas content. You can screenshot Canvas outputs, share them, or export them as PDFs (tap the share icon in the Canvas toolbar).

Camera Integration

The iOS app integrates directly with your iPhone's camera, allowing you to send photos to the agent for instant analysis. This uses the multimodal capabilities of AI models like Claude and GPT-4o.

Use cases for camera:

Document scanning: Photograph a business card, receipt, invoice, or handwritten note. The agent reads the text, extracts information, and can add it to your CRM, expense tracker, or memory.
Whiteboard capture: Photograph a whiteboard from a meeting. The agent transcribes the content and creates structured notes.
Product identification: Photograph a product and ask "What is this?" or "Where can I buy this?"
Visual troubleshooting: Photograph an error message on a screen, a physical setup (server rack, office layout), or any visual information you need the agent to analyze.

To use the camera, tap the camera icon in the chat interface. You can take a new photo or select from your photo library. Add a text message for context ("What does this receipt say?" or "Add this business card to my CRM") and send.

Camera integration requires a multimodal AI model (Claude Opus, Claude Sonnet, GPT-4o, GPT-5.4, or Gemini). If your agent is configured with a text-only model, camera inputs will fail. Check your OPENCLAW_DEFAULT_MODEL setting.

Voice Wake Setup

Voice Wake allows you to activate the agent hands-free using a custom wake word, similar to "Hey Siri" or "OK Google" but for your OpenClaw agent.

Setting up Voice Wake:

In the iOS app, go to Settings → Voice Wake
Tap "Enable Voice Wake"
Choose a wake word. The default is "Hey Atlas", but you can change it to any two- or three-word phrase.
Train the wake word by saying it 5 times when prompted. The app uses on-device speech recognition to learn your voice and pronunciation.
Set the listening mode: "Always" (listens continuously when the app is open) or "Button" (listens only when you press and hold the microphone button)

Once enabled, say your wake word and then your request: "Hey Atlas, what's on my calendar today?" The app converts your speech to text, sends it to the agent, and reads the response aloud using the system text-to-speech engine.

Voice Wake limitations:

Only works when the app is in the foreground. iOS does not allow background always-listening for third-party apps.
Wake word detection happens on-device (private), but the transcribed text is sent to your OpenClaw server for processing.
Accuracy depends on your environment. In noisy surroundings the app may activate by mistake or miss the wake word.
Response readback uses iOS text-to-speech, which sounds robotic compared to dedicated voice AI services. This may improve in future iOS updates.

Limitations vs Desktop

The iOS app is a companion, not a replacement for the full OpenClaw experience. Here are the key limitations compared to the desktop web UI:

Feature	iOS App	Desktop Web UI
Conversation	Full support	Full support
Canvas	Full support (exclusive)	Not available
Camera	Full support (exclusive)	File upload only
Voice Wake	Full support (exclusive)	Not available
Agent configuration	Not available	Full support
Environment variables	Not available	Via server access
Log viewing	Basic (last 100 entries)	Full log viewer
Skill management	Not available	Full support
Multi-agent management	Switch between agents	Full configuration
Memory file editing	Not available	Via server access
Background operation	Limited (iOS restrictions)	Always available

The iOS app is for interacting with your agent on the go. All configuration, management, and administration should be done through the desktop web UI or direct server access.

WhatsApp and Telegram as Alternatives

You do not need the iOS app to use OpenClaw on your phone. WhatsApp and Telegram work well as mobile interfaces and have some advantages over the dedicated app.

Advantages of WhatsApp/Telegram over the iOS app:

Always available: WhatsApp and Telegram receive push notifications and work in the background. The iOS app requires you to have it open.
No gateway setup: If you already have WhatsApp or Telegram connected to OpenClaw, there is nothing additional to configure.
Familiar interface: You already know how to use WhatsApp and Telegram. No new app to learn.
Group support: WhatsApp groups work natively. The iOS app does not have group functionality.
Sharing: Forwarding agent responses to others is easier in messaging apps.

Advantages of the iOS app over WhatsApp/Telegram:

Canvas: Rich visual content is not possible in messaging apps. If you need charts, tables, and interactive elements, the iOS app is the only option.
Camera integration: While you can send photos via WhatsApp, the iOS app's camera integration is faster and more seamless, with dedicated UI for photo analysis.
Voice Wake: Hands-free activation is not available through messaging apps.
Dedicated experience: The iOS app is designed specifically for OpenClaw interaction, with shortcuts, quick actions, and agent-specific features.

For most mobile users, WhatsApp or Telegram is sufficient and simpler. The iOS app is worth installing if you regularly need Canvas for visual content, use the camera frequently for document scanning, or want Voice Wake for hands-free operation. Many operators use both — WhatsApp for quick messages and notifications, the iOS app for Canvas and camera when they need richer interaction.
