
How to Give OpenClaw Permanent Memory: The Complete Guide [2026]


What should operators know before giving OpenClaw permanent memory?

Answer: Memory is what transforms OpenClaw from a stateless chatbot into an assistant that actually feels like it knows you. This guide covers the practical deployment decisions, security controls, and operations steps needed to run OpenClaw, ClawDBot, or MOLTBot memory reliably in production.

Author: Zac Frulloni

This is the complete guide to OpenClaw permanent memory: how memory.md files work, the QMD format, memory search, context injection, Obsidian integration, limits, and pruning strategies.


How Does OpenClaw Memory Actually Work?

This was the subject of a 423-point, 149-comment Reddit thread, and for good reason — memory is the feature that transforms OpenClaw from a stateless chatbot into something that actually feels like an assistant that knows you.

At its core, OpenClaw memory is simple: markdown files stored in the agent's data directory. These files contain information that the agent can reference during conversations. When you start a conversation, relevant memory files (or portions of them) are loaded into the AI model's context window alongside your message and conversation history.

The key word is "relevant." Early OpenClaw versions loaded all memory files into every conversation, which burned through tokens and made costs spiral. Modern versions use memory search to retrieve only the files (or sections) that are relevant to the current conversation topic. This means you can have gigabytes of memory files without paying to load them all into every API call.

Memory files are created in three ways:

  • Manual creation. You write markdown files and place them in the memory directory. This is the recommended starting point.
  • Agent-created. You can instruct the agent to save important information to memory during conversations. "Remember that Client X prefers email over phone calls" becomes a memory entry.
  • External sync. Tools like Obsidian, Notion (via export), or custom scripts can sync content into the memory directory.
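Since memory files are plain markdown, the agent-created path boils down to appending an entry to a file. A minimal sketch of what "remember X" could do under the hood (directory and file names are illustrative, not OpenClaw's actual API):

```python
from datetime import date
from pathlib import Path

def remember(memory_dir: str, filename: str, entry: str) -> Path:
    """Append a dated bullet to a markdown memory file, creating it if needed."""
    path = Path(memory_dir) / filename
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as f:
        f.write(f"- {date.today().isoformat()}: {entry}\n")
    return path

# "Remember that Client X prefers email over phone calls"
remember("memory", "client-x.md", "Client X prefers email over phone calls")
```

Because entries land in ordinary markdown, anything the agent saves this way remains editable by hand or in Obsidian.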

What Is QMD Format and Why Does It Matter?

QMD (Quick Memory Document) is a structured markdown convention that makes memory files more efficient for AI retrieval. It is not a new file format — it is a set of conventions for organizing standard markdown files so OpenClaw can parse them quickly.

A QMD file looks like this:

---
tags: [client, acme-corp, active]
updated: 2026-03-20
---

# ACME Corp - Client Profile

## Contact
- Primary: Jane Smith, CEO, jane@acme.com
- Secondary: Bob Johnson, CTO, bob@acme.com

## Engagement
- Started: January 2026
- Type: Monthly retainer, $5,000/month
- Scope: AI automation consulting

## Preferences
- Jane prefers short, direct emails
- Bob wants technical detail in all proposals
- Always CC both on project updates

## Current Projects
- Invoice automation (in progress, due April 15)
- CRM integration (scoping phase)

## History
- 2026-01-15: Signed engagement letter
- 2026-02-01: Completed intake automation
- 2026-03-01: Started invoice project

The frontmatter (the section between the --- markers) contains metadata tags that help memory search find relevant files. The body uses clear headings and bullet points so the AI can quickly locate specific information without reading the entire file.

Why this matters for cost and quality: A well-structured QMD file allows the memory search system to retrieve just the relevant section instead of the entire file. If someone asks about ACME Corp's current projects, the search pulls the "Current Projects" section — not the entire client history. This reduces token usage and keeps the context window focused.
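To make section-level retrieval concrete, here is a minimal sketch of splitting a QMD-style file into its frontmatter and ## sections. This is illustrative, not OpenClaw's actual parser:

```python
import re

QMD = """---
tags: [client, acme-corp, active]
updated: 2026-03-20
---

# ACME Corp - Client Profile

## Contact
- Primary: Jane Smith, CEO, jane@acme.com

## Current Projects
- Invoice automation (in progress, due April 15)
- CRM integration (scoping phase)
"""

def parse_qmd(text: str):
    """Split a QMD-style file into (frontmatter, {heading: section body})."""
    frontmatter, body = "", text
    if text.startswith("---"):
        _, frontmatter, body = text.split("---", 2)
    sections = {}
    for chunk in re.split(r"(?m)^## ", body):
        chunk = chunk.strip()
        if chunk and not chunk.startswith("#"):  # skip the document title
            heading, _, content = chunk.partition("\n")
            sections[heading.strip()] = content.strip()
    return frontmatter.strip(), sections

meta, sections = parse_qmd(QMD)
print(sections["Current Projects"])
```

A retrieval layer built on this can inject only `sections["Current Projects"]` into the context, rather than the whole client file.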

How Does Memory Search Work?

Memory search is the mechanism that decides which memory content is relevant to the current conversation. Without it, every memory file gets loaded into every API call — expensive and noisy. With it, only the relevant context is retrieved.

OpenClaw supports two search approaches:

Keyword search. The simplest approach. The agent's current query is matched against tags and content in memory files. Files with matching keywords are ranked by relevance and the top results are injected into the context. Fast, cheap, and works well for structured memory files with good tags.
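A toy version of keyword ranking shows the idea. File names, contents, and the scoring rule are illustrative; OpenClaw's real search is more involved:

```python
def keyword_score(query: str, doc: str) -> int:
    """Count how many query words appear in the document (case-insensitive)."""
    words = set(query.lower().split())
    text = doc.lower()
    return sum(1 for w in words if w in text)

memory = {
    "acme-corp.md": "tags: client acme-corp\nJane prefers short emails. Invoice automation due April 15.",
    "beta-labs.md": "tags: client beta-labs\nMike is flexible on meeting times.",
}

query = "acme invoice status"
ranked = sorted(memory, key=lambda f: keyword_score(query, memory[f]), reverse=True)
print(ranked[0])  # → acme-corp.md
```

Good tags raise the score of the right files, which is why well-tagged QMD frontmatter makes keyword search work so well.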

Vector search. More sophisticated. Memory files are embedded into a vector database (like ChromaDB or Qdrant), and queries are matched by semantic similarity rather than keyword overlap. This means "What did the CEO of ACME say about our pricing?" can retrieve relevant content even if the word "pricing" doesn't appear in the memory file — as long as the content is semantically related. More accurate but requires additional infrastructure.

For most operators, keyword search with well-tagged QMD files is sufficient. Vector search becomes valuable when you have large memory collections (100+ files) or when your queries frequently use different terminology than your stored content.


How Does Context Injection Put Memory to Use?

Context injection is the process of inserting retrieved memory into the AI model's prompt. Understanding how this works helps you write better memory files.

When you send a message to your agent, the system assembles a prompt that looks approximately like this:

  1. System prompt: The agent's core personality and instructions.
  2. Retrieved memory: Relevant sections from your memory files, based on search results.
  3. Active skill definitions: Any skills that might be relevant to the current message.
  4. Conversation history: Recent messages in the current conversation.
  5. Your message: What you just said.

Memory sits near the top of this stack, which means it gets high priority in the model's attention. This is by design — memory should inform everything the agent says and does.
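The five-layer stack above can be sketched as a simple assembly function. The delimiters and parameter names here are made up for illustration; OpenClaw's actual prompt assembly differs in detail:

```python
def build_prompt(system, memory_sections, skills, history, user_message):
    """Assemble the prompt layers in the order described above."""
    parts = [
        "[SYSTEM]\n" + system,
        "[MEMORY]\n" + "\n\n".join(memory_sections),
        "[SKILLS]\n" + "\n".join(skills),
        "[HISTORY]\n" + "\n".join(history),
        "[USER]\n" + user_message,
    ]
    return "\n\n".join(parts)

prompt = build_prompt(
    system="You are a helpful operations assistant.",
    memory_sections=["## ACME Corp\n- Jane prefers short emails"],
    skills=["calendar: schedule meetings"],
    history=["user: any updates on ACME?"],
    user_message="Draft a follow-up email to Jane.",
)
```

Note that retrieved memory is injected before conversation history, which is what gives it its high-priority position in the model's attention.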

Practical implication: Write memory files as if you are briefing a new assistant on their first day. Clear, concise, and organized by topic. Avoid narrative prose — use bullet points and structured sections. The AI processes structured information more reliably than unstructured paragraphs.



How Do You Integrate OpenClaw With Obsidian?

Obsidian is a powerful markdown editor with features like backlinks, graph view, tags, and plugins. Since OpenClaw memory files are standard markdown, integrating the two is straightforward.

Option 1: Symlink (recommended). Create a symbolic link from a folder in your Obsidian vault to your OpenClaw memory directory. Any file you create or edit in Obsidian is immediately available to OpenClaw.

ln -s /path/to/openclaw/memory /path/to/obsidian/vault/openclaw-memory

Option 2: Sync tool. Use rsync, Syncthing, or a similar tool to keep an Obsidian folder synchronized with the OpenClaw memory directory. This works well when Obsidian and OpenClaw are on different machines.
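If rsync or Syncthing is unavailable, the same one-way mirror can be a few lines of Python. This is a simplified stand-in (paths would be your own; unlike rsync --delete, it copies but never deletes):

```python
import shutil
from pathlib import Path

def sync_vault(vault_dir: str, memory_dir: str) -> int:
    """One-way mirror: copy every .md file from the vault into the memory dir."""
    src, dst = Path(vault_dir), Path(memory_dir)
    copied = 0
    for md in src.rglob("*.md"):
        target = dst / md.relative_to(src)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(md, target)  # copy2 preserves modification times
        copied += 1
    return copied
```

Run it from cron or a systemd timer on the OpenClaw host and the memory directory tracks the vault within one sync interval.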

Option 3: Obsidian Git plugin. Use the Obsidian Git plugin to push changes to a repository, and configure your OpenClaw server to pull from the same repository on a schedule.

The Obsidian workflow is particularly powerful because you can use Obsidian's graph view to visualize relationships between memory files, use backlinks to cross-reference related information, and use Obsidian's search to find content before the agent does.


What Are the Memory Limits?

Memory limits come from two sources: the AI model's context window and your API budget.

Context window limits. Each AI model has a maximum context window size. Claude Sonnet supports up to 200,000 tokens. GPT-4o supports 128,000. DeepSeek V3 supports 128,000. Your memory retrieval cannot exceed the available context after accounting for the system prompt, conversation history, and skill definitions.

In practice, you want memory to consume no more than 20-30% of your available context. For a 200K context model, that is 40,000-60,000 tokens of memory per call. For a 128K model, it is 25,000-38,000 tokens. This is enough for substantial memory — roughly 30,000-45,000 words of retrieved content.

Cost limits. Every token of memory injected into the context costs money. If you retrieve 10,000 tokens of memory for every API call, and you make 100 calls per day at Claude Sonnet rates ($3/M tokens), that is $3/day just for memory context — $90/month. Efficient memory retrieval (pulling only what is needed) keeps these costs manageable.
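The cost arithmetic above, as a reusable sketch (rates and volumes are the example figures from the text, not live pricing):

```python
def memory_cost_per_month(tokens_per_call: int, calls_per_day: int,
                          price_per_million: float, days: int = 30) -> float:
    """Monthly cost of injected memory context at a given input-token price."""
    daily = tokens_per_call * calls_per_day / 1_000_000 * price_per_million
    return daily * days

# 10K memory tokens per call, 100 calls/day, $3 per million input tokens
print(memory_cost_per_month(10_000, 100, 3.00))  # → 90.0
```

Halving the tokens retrieved per call halves this number, which is the economic case for section-level retrieval.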

Practical limit: Keep individual memory files under 2,000 words. Keep your total memory collection organized with clear tags. Let the search system retrieve only what is relevant rather than loading everything. These practices keep costs low and retrieval quality high.


How Do You Prune Memory Effectively?

Memory files grow over time. Without regular maintenance, they become bloated with outdated information, duplicate entries, and content that no longer serves a purpose. Bloated memory costs more (more tokens retrieved), reduces quality (relevant content gets diluted), and slows search.

Monthly review (30 minutes). Once a month, scan your memory files for:

  • Outdated information. Clients who are no longer active, projects that are completed, preferences that have changed. Delete or archive.
  • Duplicate entries. The same information stored in multiple files. Consolidate into one authoritative location.
  • Overly verbose entries. Sections that could be shortened without losing meaning. Memory files should be concise, not comprehensive.

Quarterly deep clean (1-2 hours). Every three months, do a thorough review:

  • Reorganize files by current relevance, not creation date.
  • Split large files (over 2,000 words) into focused topic-specific files.
  • Update tags and metadata to reflect current priorities.
  • Archive historical content that has not been retrieved in the past quarter.

Automated pruning. Some operators configure the agent itself to flag memory entries it has not referenced in 30+ days. The agent generates a "memory review report" monthly, listing candidates for deletion or archival. The human reviews and approves.
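The flagging step can be sketched in a few lines. One assumption to flag: "not referenced by the agent" would require OpenClaw's own retrieval logs, so this proxy uses file modification time instead:

```python
import time
from pathlib import Path

def stale_memory_files(memory_dir: str, days: int = 30) -> list:
    """Return memory files untouched for `days` days -- candidates for review."""
    cutoff = time.time() - days * 86_400  # seconds per day
    return sorted(
        p for p in Path(memory_dir).rglob("*.md")
        if p.stat().st_mtime < cutoff
    )
```

Feed the resulting list into the monthly review report for human approval rather than deleting anything automatically.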


What Does Good Memory Enable in Practice?

The difference between an agent with good memory and one without is dramatic. Here are real examples:

Without memory: "Schedule a meeting with the client." — "Which client? What type of meeting? What times work for you?"

With memory: "Schedule a meeting with the client." — "I see you have two active clients: ACME Corp (primary contact Jane Smith) and Beta Labs (primary contact Mike Davis). Jane prefers Tuesday mornings and Mike is flexible. Which client, and shall I propose times this week?"

Without memory: "Draft a follow-up email." — Generic, template-style email.

With memory: "Draft a follow-up email." — Personalized email referencing the last conversation topic, the client's communication preferences, and any outstanding action items from previous interactions.

Good memory compounds over time. After a month, the agent knows your patterns, preferences, contacts, and priorities well enough that interactions feel natural and efficient. After three months, it is genuinely hard to imagine going back to a stateless tool. The 423-point Reddit post existed because this is the feature that makes OpenClaw feel like a real assistant — and the one that most people set up incorrectly or not at all.