Remote OpenClaw Blog

What Is OpenShell? OpenClaw's Pluggable Sandbox Backend Explained



Author: Zac Frulloni

OpenShell is OpenClaw's pluggable sandbox backend, introduced in version 3.22. This guide covers what it does, why sandboxing matters for AI code execution, how to configure the SSH backend, and what security benefits it provides.

Marketplace

Free skills and AI personas for OpenClaw — deploy a pre-built agent in 15 minutes.

Browse the Marketplace →

Join the Community

Join 500+ OpenClaw operators sharing deployment guides, security configs, and workflow automations.

What Is OpenShell?

OpenShell is the pluggable sandbox backend system introduced in OpenClaw version 3.22. Before OpenShell, OpenClaw had a single, tightly coupled approach to code execution — everything ran inside the same Docker container as the agent itself. This was simple but created security and flexibility problems that became increasingly apparent as operators deployed OpenClaw in production environments.

OpenShell solves this by creating an abstraction layer between the AI agent and the execution environment. Instead of code running directly on the host or inside the main container, it runs through a sandbox backend that you choose and configure. The agent sends code to OpenShell, OpenShell routes it to the configured backend, and the backend executes it in isolation.

Think of OpenShell as a switchboard for code execution. The AI generates code, but where that code actually runs is entirely up to you. You can route it to a Docker container, an SSH remote host, or a custom backend that you build yourself. The agent does not know or care which backend is handling execution — it just sends code and receives output.
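The switchboard idea can be pictured as a small dispatcher that looks up the configured backend and forwards code to it. The sketch below is illustrative only: the backend names mirror the SANDBOX_BACKEND values documented later, but the function names are assumptions, not OpenClaw's actual internals.

```python
# Illustrative sketch of OpenShell-style routing; function names are
# hypothetical and do not reflect OpenClaw's real internals.
import os
import subprocess

def run_in_docker(code: str) -> str:
    # A real Docker backend would create an isolated container; elided here.
    raise NotImplementedError

def run_locally(code: str) -> str:
    # The "none" backend: executes in the agent's own context (unsafe).
    result = subprocess.run(["sh", "-c", code], capture_output=True, text=True)
    return result.stdout

BACKENDS = {
    "docker": run_in_docker,
    "none": run_locally,
}

def execute(code: str) -> str:
    # The agent never chooses the backend; configuration does.
    backend = BACKENDS[os.environ.get("SANDBOX_BACKEND", "docker")]
    return backend(code)
```

The key property is that `execute` is the only entry point the agent sees; swapping backends is purely a configuration change.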

This architecture was one of the most requested features in the OpenClaw community, particularly from operators running in regulated environments where code execution must happen on specific, audited infrastructure.


Why Does Sandboxing Matter for AI Code Execution?

AI agents generate and execute code based on language model outputs. This is fundamentally different from traditional software where a human developer writes, reviews, and tests every line before it runs. With an AI agent, code is generated dynamically in response to user requests, and it can be unpredictable.

Without sandboxing, AI-generated code runs with the same permissions as the OpenClaw process itself. On many deployments, that means root access inside a Docker container, or worse, direct access to the host filesystem. A single malformed command could:

  • Delete files — including configuration files, data, or even the agent itself
  • Exfiltrate data — sending sensitive information to external endpoints
  • Consume resources — infinite loops, fork bombs, or excessive disk writes
  • Modify the network — opening ports, changing firewall rules, or connecting to internal services
  • Install software — adding packages or binaries that create persistent backdoors

Sandboxing isolates code execution from the host system. Even if the AI generates a destructive command, the damage is contained within the sandbox. The sandbox can be destroyed and recreated without affecting the host, the agent, or any persistent data.

This is not a theoretical concern. Community operators have reported cases where AI-generated code accidentally overwrote configuration files, ran commands that consumed all available disk space, or made unintended API calls. Sandboxing prevents these incidents from becoming catastrophic.


What Sandbox Backends Are Available?

OpenClaw 3.22 ships with three sandbox backend options:

1. Docker (default)

The Docker backend spins up a separate container for code execution. It uses the same Docker daemon as OpenClaw but creates an isolated container with its own filesystem, network namespace, and resource limits. This is the default and works well for most deployments.

SANDBOX_BACKEND=docker
SANDBOX_DOCKER_IMAGE=openclaw/sandbox:latest
SANDBOX_DOCKER_MEMORY=512m
SANDBOX_DOCKER_CPUS=1

2. SSH

The SSH backend executes code on a remote machine via SSH. This is ideal for operators who want code execution to happen on a dedicated, hardened server that is separate from the machine running OpenClaw. The remote machine can have its own firewall rules, resource limits, and security policies.

SANDBOX_BACKEND=ssh
SSH_HOST=sandbox.example.com
SSH_USER=sandbox
SSH_KEY_PATH=/keys/sandbox_key
SSH_PORT=22
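Under the hood, an SSH backend boils down to wrapping each execution in an `ssh` invocation built from these settings. A rough sketch of that mapping (the function name and construction are assumptions for illustration, not OpenShell's code):

```python
import os

def build_ssh_command(code: str) -> list[str]:
    # Maps the SSH_* settings onto a standard ssh invocation.
    # BatchMode prevents interactive password prompts from hanging the agent.
    return [
        "ssh",
        "-i", os.environ["SSH_KEY_PATH"],
        "-p", os.environ.get("SSH_PORT", "22"),
        "-o", "BatchMode=yes",
        f'{os.environ["SSH_USER"]}@{os.environ["SSH_HOST"]}',
        code,
    ]
```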

3. None (not recommended)

Setting the backend to none disables sandboxing entirely. Code executes directly in the OpenClaw process context. This is only appropriate for development and testing — never for production.

SANDBOX_BACKEND=none

The architecture is extensible. OpenShell defines a standard interface that custom backends must implement, so you can build your own backend for specialized environments (Kubernetes pods, Firecracker microVMs, cloud functions, etc.).
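In practice, such an interface reduces to one contract: take code in, return output. The class and method names below are hypothetical, sketched to show the shape of a custom backend rather than the actual OpenShell API.

```python
from abc import ABC, abstractmethod
import subprocess

class SandboxBackend(ABC):
    """Hypothetical sketch of an OpenShell-style backend contract."""

    @abstractmethod
    def execute(self, code: str, timeout: int = 30) -> str:
        """Run code in isolation and return its combined output."""

class SubprocessBackend(SandboxBackend):
    # Toy example only: a real backend (Kubernetes pod, Firecracker microVM)
    # would provide genuine isolation, which a plain subprocess does not.
    def execute(self, code: str, timeout: int = 30) -> str:
        result = subprocess.run(
            ["sh", "-c", code],
            capture_output=True, text=True, timeout=timeout,
        )
        return result.stdout + result.stderr
```

A Kubernetes or microVM backend would keep the same `execute` signature and change only how the process is isolated.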



How Do You Configure the SSH Sandbox Backend?

The SSH backend is the most popular new option because it lets you physically separate code execution from the agent. Here is the complete setup process:

Step 1: Prepare the remote host.

Set up a machine (VPS, dedicated server, or VM) that will serve as the sandbox. Install any tools the AI might need (Python, Node.js, git, etc.). Create a dedicated user with limited permissions:

sudo useradd -m -s /bin/bash sandbox
sudo mkdir -p /home/sandbox/.ssh
# Set resource limits in /etc/security/limits.conf
echo "sandbox hard nproc 100" | sudo tee -a /etc/security/limits.conf
echo "sandbox hard nofile 1024" | sudo tee -a /etc/security/limits.conf

Step 2: Generate and deploy SSH keys.

ssh-keygen -t ed25519 -f sandbox_key -N ""
# Copy public key to remote host
ssh-copy-id -i sandbox_key.pub sandbox@sandbox.example.com

Step 3: Configure OpenClaw.

Add the following to your .env file or docker-compose.yml:

SANDBOX_BACKEND=ssh
SSH_HOST=sandbox.example.com
SSH_USER=sandbox
SSH_KEY_PATH=/keys/sandbox_key
SSH_PORT=22
SSH_TIMEOUT=30

Step 4: Mount the key in Docker.

If running OpenClaw in Docker, mount the SSH key as a volume:

volumes:
  - ./sandbox_key:/keys/sandbox_key:ro
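Putting the key mount and the environment settings together, the relevant slice of a docker-compose.yml might look like this (the service and image names are placeholders; adjust to your deployment):

```yaml
services:
  openclaw:
    image: openclaw/openclaw:latest   # placeholder image name
    environment:
      SANDBOX_BACKEND: ssh
      SSH_HOST: sandbox.example.com
      SSH_USER: sandbox
      SSH_KEY_PATH: /keys/sandbox_key
      SSH_PORT: "22"
      SSH_TIMEOUT: "30"
    volumes:
      - ./sandbox_key:/keys/sandbox_key:ro
```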

Step 5: Test the connection.

Restart OpenClaw and check the logs for a successful sandbox connection. You should see OpenShell: SSH backend connected to sandbox.example.com in the startup output.


What Are the Security Benefits?

OpenShell with a properly configured sandbox backend provides several layers of security:

  • Process isolation: Code runs in a separate process space with its own memory, file descriptors, and network stack. A crash or exploit in the sandbox does not affect the agent.
  • Filesystem isolation: The sandbox has its own filesystem. Even if AI-generated code tries to read /etc/passwd or modify system files, it only sees the sandbox's filesystem.
  • Network isolation: Docker backends can restrict network access entirely or limit it to specific hosts. SSH backends inherit the remote host's firewall rules.
  • Resource limits: Both Docker and SSH backends support CPU, memory, and process count limits. An infinite loop in the sandbox is killed when it hits the limit, not when it crashes your server.
  • Ephemeral execution: Sandboxes can be reset between executions. Any files created, packages installed, or state changes are wiped clean. This prevents persistent compromise.
  • Audit trail: All code sent to the sandbox is logged by OpenShell before execution. You can review exactly what the AI tried to run, regardless of the outcome.

For operators in regulated industries, the SSH backend is particularly valuable because it allows code execution to happen on infrastructure that is separately audited, monitored, and compliant with specific security standards.


How Do You Troubleshoot Sandbox Issues?

Common sandbox problems and their fixes:

"Sandbox connection failed" — For SSH backends, verify the host is reachable, the SSH key has correct permissions (600), and the user can actually log in. Test manually with ssh -i /path/to/key sandbox@host.

"Sandbox timeout exceeded" — The code took too long to execute. Increase SSH_TIMEOUT or SANDBOX_DOCKER_TIMEOUT, or investigate why the code is slow (large downloads, infinite loops, etc.).

"Permission denied in sandbox" — The sandbox user does not have permission to run the requested command. Check the remote user's permissions and ensure required tools are installed and accessible.

"Docker sandbox failed to start" — The Docker daemon may be out of resources. Check docker ps for orphaned sandbox containers, prune old images, and verify disk space.

Code works locally but fails in sandbox — The sandbox environment is different from your host. Ensure the sandbox has the same tools, language runtimes, and environment variables that the code expects.

Enable debug logging with LOG_LEVEL=debug to see detailed OpenShell output, including the exact commands sent to the sandbox and the raw output returned.