
Remote OpenClaw Blog

OpenClaw 4.5 Update: Video Gen, Music Gen, and Dreaming GA


What Ships in OpenClaw 4.5

OpenClaw 4.5 is the largest feature release since 4.0. It extends the platform beyond text and code into multimodal content creation — video and music generation — while stabilizing the Dreaming memory system that has been in beta since October 2025. The release also broadens the provider ecosystem, adds 12-language support, and patches security vulnerabilities discovered during the 4.4.x cycle.

Here is the headline summary of everything in this release:

Category          | What Changed
------------------|-------------------------------------------------------
Video Generation  | xAI, Runway (Gen-3/Gen-4), Wan, ComfyUI backends
Music Generation  | Google Lyria, MiniMax backends
New LLM Providers | Qwen (Alibaba), Fireworks AI, StepFun
Dreaming          | Graduated from beta to General Availability
Languages         | 12 languages in CLI and docs (up from 4)
Security          | Token handling fixes, MCP server validation hardening
Breaking Change   | Config schema migration via openclaw doctor --fix

Let's walk through each of these in detail.


Video Generation

OpenClaw 4.5 introduces a unified video generation interface that works across four different backends. You describe the video you want in natural language, and OpenClaw routes the request to your configured provider, handles the generation process, and delivers the output file.

xAI (Grok Video)

xAI's video generation model, accessible through the Grok API, offers the fastest generation times of the four backends. A 5-second clip typically generates in under 30 seconds. Quality is good for social media content, product demos, and quick visualizations. The model handles text-to-video and image-to-video inputs.

# Configure xAI video in your OpenClaw config
video:
  provider: xai
  api_key: your-xai-api-key
  default_duration: 5
  default_resolution: 1080p

Runway (Gen-3 Alpha and Gen-4)

Runway remains the gold standard for video quality. Gen-4, released in early 2026, produces remarkably coherent motion and scene transitions. The trade-off is cost — Runway charges per second of generated video, and a 10-second clip at full quality can cost $2-5 depending on resolution and complexity.

# Configure Runway video
video:
  provider: runway
  api_key: your-runway-api-key
  model: gen-4  # or gen-3-alpha for lower cost
  default_duration: 5
  default_resolution: 1080p

Wan (Open-Source Video Diffusion)

Wan is the open-source option. It runs locally on your GPU (minimum 12GB VRAM for decent quality) or through community-hosted inference endpoints. Quality is a step below Runway but the cost is zero if you self-host. For operators generating high volumes of video content, the cost savings are substantial.

# Configure Wan (local)
video:
  provider: wan
  endpoint: http://localhost:8188
  model: wan-2.1
  default_duration: 5

ComfyUI (Custom Workflows)

ComfyUI integration gives advanced operators access to Stable Video Diffusion and custom video generation pipelines. If you already have ComfyUI workflows for image generation, you can extend them to video with minimal configuration. This is the most flexible option but requires the most technical knowledge to set up.

# Configure ComfyUI video
video:
  provider: comfyui
  endpoint: http://localhost:8188
  workflow: video-generation-v2.json

All four backends integrate with OpenClaw's task system. You can schedule video generation jobs, chain them with other tasks (generate video, then post to social media), and use different backends for different quality/cost requirements.
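
As a sketch of how such a chain might look in config (the tasks, schedule, and steps key names below are illustrative assumptions, not documented schema):

```yaml
# Hypothetical task chain: generate a clip on a schedule, then post it.
# Key names (tasks, schedule, steps, uses, with) are illustrative only.
tasks:
  - name: weekly-promo-clip
    schedule: "0 9 * * 1"          # cron: Mondays at 09:00
    steps:
      - uses: video.generate
        with:
          provider: wan            # cheap local backend for routine clips
          prompt: "15-second product teaser, clean studio lighting"
          duration: 15
      - uses: social.post
        with:
          platform: mastodon
          attach: "{{ steps[0].output_file }}"
```

The point is less the exact syntax than the shape: each step consumes the previous step's output, and the provider can differ per task to balance quality against cost.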


Music Generation

Music generation in OpenClaw 4.5 follows the same unified interface pattern as video. Two backends ship at launch.

Google Lyria

Lyria is Google DeepMind's music generation model, accessible through the Google AI API. It produces high-quality instrumental and vocal tracks from text descriptions. Lyria excels at generating background music, jingles, and ambient audio — exactly the type of content operators need for video projects, podcasts, and social media.

# Configure Lyria music generation
music:
  provider: lyria
  api_key: your-google-ai-api-key
  default_duration: 30  # seconds
  default_format: mp3

Lyria supports genre specification, mood parameters, tempo control, and instrumentation preferences. You can describe "upbeat electronic track, 120 BPM, synth-heavy, suitable for a product launch video" and get a usable result in under a minute.
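
If you prefer those parameters as config defaults rather than repeating them in every prompt, a minimal sketch might look like this (the defaults sub-keys are assumptions based on the capabilities listed above, not confirmed schema):

```yaml
# Hypothetical Lyria defaults; the genre/mood/bpm key names are illustrative.
music:
  provider: lyria
  api_key: your-google-ai-api-key
  defaults:
    genre: electronic
    mood: upbeat
    bpm: 120
    instrumentation: synth-heavy
```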

MiniMax

MiniMax is a Chinese AI company whose music generation model offers strong quality at lower cost than Lyria. It handles both instrumental and vocal generation, with particularly good results for pop, electronic, and cinematic genres. MiniMax also supports lyrics-to-song generation, where you provide lyrics and the model generates a complete vocal track.

# Configure MiniMax music generation
music:
  provider: minimax
  api_key: your-minimax-api-key
  default_duration: 30
  default_format: mp3

The practical difference between Lyria and MiniMax is subtle for most use cases. Lyria produces slightly more polished instrumental tracks; MiniMax handles vocal generation better and costs less. Many operators will want to test both and choose based on their specific content needs.


New LLM Providers: Qwen, Fireworks, StepFun

OpenClaw 4.5 adds first-class support for three new LLM providers, expanding the model ecosystem that operators can draw from.

Qwen (Alibaba Cloud)

Qwen 2.5 and Qwen 3 models are now available as native providers. Qwen has been accessible through OpenRouter for some time, but direct API support means lower latency, better token pricing, and access to Alibaba's free tier for developers. Qwen 2.5-72B is particularly strong for coding tasks and competes well with Claude Haiku on price-performance.

# Configure Qwen directly
llm:
  provider: qwen
  model: qwen-2.5-72b
  api_key: your-dashscope-api-key

Fireworks AI

Fireworks specializes in fast inference for open models. Their infrastructure delivers some of the lowest latency numbers available for models like Llama, Mixtral, and Qwen. Native Fireworks support in OpenClaw means you can use their optimized inference endpoints directly rather than routing through OpenRouter.

# Configure Fireworks AI
llm:
  provider: fireworks
  model: accounts/fireworks/models/llama-v3p3-70b-instruct
  api_key: your-fireworks-api-key

StepFun

StepFun is a newer Chinese AI lab whose Step models have shown strong multilingual performance. Their Step-2 model is competitive with GPT-4-class models on Chinese and East Asian language tasks at significantly lower cost. For operators serving multilingual audiences or processing non-English content, StepFun is a valuable addition.

# Configure StepFun
llm:
  provider: stepfun
  model: step-2-16k
  api_key: your-stepfun-api-key

All three providers are configured through the standard OpenClaw config format. You can mix them with existing providers, assign different models to different tasks, and switch freely based on cost and capability requirements.
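
Per-task model routing could be sketched like this (the routing block is an illustrative assumption, not documented schema; the model identifiers come from the provider sections above):

```yaml
# Hypothetical mixed-provider setup: a default model plus per-task overrides.
# The routing key and its sub-keys are illustrative only.
llm:
  provider: qwen
  model: qwen-2.5-72b
  api_key: ${DASHSCOPE_API_KEY}
routing:
  coding: qwen-2.5-72b          # strong price-performance for code
  multilingual: step-2-16k      # StepFun for East Asian language tasks
  summarization: accounts/fireworks/models/llama-v3p3-70b-instruct
```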


Dreaming Graduates to General Availability

Dreaming has been in beta since OpenClaw 4.0, and with 4.5 it officially graduates to General Availability (GA). This means Dreaming is now production-ready, fully supported, and enabled by default in new installations.

For operators unfamiliar with Dreaming: it is OpenClaw's autonomous memory consolidation system. During idle periods (typically overnight), Dreaming processes your agent's recent conversation history and extracts patterns, preferences, decisions, and knowledge. It then writes consolidated insights to MEMORY.md, which becomes persistent context for future sessions.

The GA release includes several improvements over the beta, and enabling it now requires only a minimal block of config:

# Enable Dreaming in OpenClaw 4.5 (minimal config)
dreaming:
  enabled: true

For a deep dive into how Dreaming works — the three phases (Light, Deep, REM), weighted scoring, and advanced configuration — see our dedicated OpenClaw Dreaming Guide.
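
As a taste of what fuller configuration might look like, every key below other than enabled is a guess at the advanced options the guide covers, not confirmed schema:

```yaml
# Hypothetical expanded Dreaming config; phase and scoring keys are illustrative.
dreaming:
  enabled: true
  schedule: "02:00-05:00"     # idle window for consolidation
  phases: [light, deep, rem]
  scoring:
    recency_weight: 0.5       # weighted scoring decides what gets consolidated
    frequency_weight: 0.3
    decision_weight: 0.2
```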


12-Language Support

OpenClaw 4.5 expands language support from 4 languages (English, Chinese, Japanese, Korean) to 12. The new additions are:

  • Spanish
  • French
  • German
  • Portuguese
  • Italian
  • Russian
  • Arabic
  • Hindi

Language support covers the CLI interface (commands, help text, error messages), documentation, and the default system prompts used for agent interactions. The underlying LLM capabilities for each language depend on the model you choose — Claude and GPT handle most of these languages well, while some open models have stronger support for specific language families.

To set your language:

# Set language in config.yaml
language: es  # Spanish

Or via the CLI:

openclaw config set language fr  # French

This is a significant step toward making OpenClaw accessible to the global operator community. The Remote OpenClaw community has seen growing membership from Spanish-speaking and Portuguese-speaking countries in particular, and native-language CLI support removes a meaningful friction point.


Security Fixes

OpenClaw 4.5 patches two security issues discovered during the 4.4.x cycle. Both were responsibly disclosed through the OpenClaw security reporting process.

Token Handling Fix

A vulnerability in how OpenClaw stored and rotated API tokens could, under specific conditions, leave expired tokens in plaintext log files. While the tokens were expired (and therefore not directly exploitable), their presence in logs created a risk if those logs were shared or stored insecurely. The fix ensures all tokens are scrubbed from log output and expired tokens are immediately purged from memory.

MCP Server Validation Hardening

The MCP (Model Context Protocol) server handshake process did not sufficiently validate server identity in certain edge cases. A malicious MCP server on a local network could, theoretically, impersonate a legitimate server and intercept agent communications. The fix adds certificate pinning and stricter server identity validation to the MCP handshake.
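
If you run local MCP servers, explicit pinning might look something like the following sketch (the pinned_cert_sha256 key is an assumption about how the new validation is surfaced in config, not documented syntax):

```yaml
# Hypothetical MCP server entry with an explicitly pinned certificate.
mcp:
  servers:
    - name: local-tools
      endpoint: https://192.168.1.20:9000
      pinned_cert_sha256: "<fingerprint from the server operator>"
```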

Both fixes are applied automatically when you update to 4.5. No manual intervention is required beyond running the update.


Breaking Change: openclaw doctor --fix

OpenClaw 4.5 introduces a new configuration schema that is not backward-compatible with 4.4.x configurations. After updating, you must run:

openclaw doctor --fix

This command does four things:

  1. Backs up your current configuration to ~/.openclaw/config.yaml.bak
  2. Detects schema incompatibilities between your existing config and the 4.5 schema
  3. Migrates your configuration to the new format, preserving all your settings
  4. Validates the migrated configuration and reports any issues that need manual attention

The main changes in the new schema:

  • Video and music provider configurations are now under dedicated top-level keys (video: and music:) rather than nested under the general integrations: key.
  • Dreaming configuration has been simplified from multiple parameters to a single dreaming: block.
  • Provider API keys now support environment variable references (${OPENROUTER_API_KEY}) natively, replacing the previous workaround of using shell expansion.
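
Native environment variable references look like this (a minimal sketch; the variable name is just an example):

```yaml
# API keys can reference environment variables natively in 4.5,
# so secrets stay out of config.yaml.
llm:
  provider: fireworks
  model: accounts/fireworks/models/llama-v3p3-70b-instruct
  api_key: ${FIREWORKS_API_KEY}   # resolved from the environment at load time
```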

If you skip openclaw doctor --fix, OpenClaw will start but may behave unexpectedly — features may not load, providers may not authenticate, and Dreaming may not run. Always run the doctor after updating.


Migration Guide: Updating to 4.5

Here is the step-by-step process for updating from any 4.x version to 4.5:

Step 1: Update OpenClaw

# Using the built-in updater
openclaw update

# Or via npm
npm update -g @openclaw/cli

# Or via Homebrew (macOS)
brew upgrade openclaw

Step 2: Run the Doctor

# This is REQUIRED — do not skip
openclaw doctor --fix

Review the output. The doctor will report what it changed and flag anything that needs your attention. In most cases, the migration is fully automatic.

Step 3: Configure New Features (Optional)

If you want to use the new video or music generation features, add the appropriate provider configuration to your config file. See the sections above for examples.

Step 4: Verify

# Check that everything is working
openclaw status

# Run a quick test task
openclaw run "What version am I running?"

If you encounter issues after updating, the first troubleshooting step is always openclaw doctor --fix again — it is idempotent and safe to run multiple times.


OpenClaw 4.5 by the numbers: 4 video generation backends, 2 music generation backends, 3 new LLM providers, 12 supported languages.

Frequently Asked Questions

Is the openclaw doctor --fix command required after updating to 4.5?

Yes. OpenClaw 4.5 includes a breaking change in the configuration schema. After updating, you must run openclaw doctor --fix to migrate your existing configuration to the new format. If you skip this step, OpenClaw may fail to start or behave unexpectedly. The doctor command will automatically detect and fix configuration issues, and it will back up your old config before making changes.

Which video generation providers does OpenClaw 4.5 support?

OpenClaw 4.5 supports four video generation backends: xAI (Grok's video model), Runway (Gen-3 Alpha and Gen-4), Wan (open-source video diffusion), and ComfyUI (for custom Stable Video Diffusion workflows). Each provider has different strengths — xAI for speed, Runway for quality, Wan for cost (free/open-source), and ComfyUI for maximum customization. You configure providers in your OpenClaw config and can switch between them per task.

What is Dreaming and why did it go GA in 4.5?

Dreaming is OpenClaw's autonomous memory consolidation system. It runs during idle periods (typically overnight) and processes your agent's conversation history to extract patterns, preferences, and knowledge — writing consolidated insights to MEMORY.md. It moved from beta to General Availability (GA) in 4.5 after six months of testing proved it stable and effective. GA means it is now production-ready and fully supported with no experimental flags required.

Do I need to pay extra for the new video and music generation features?

OpenClaw itself remains free and open-source — no extra charge for any feature. However, the video and music generation providers have their own costs. Runway and xAI charge per generation. Lyria (Google) and MiniMax have usage-based pricing. Wan and ComfyUI can run locally for free if you have a capable GPU. The costs vary significantly by provider and usage volume, so check each provider's pricing page for current rates.


Further Reading