Remote OpenClaw Blog
Hermes Agent Memory System Explained: How Persistent Memory Works
Hermes Agent uses a dual memory architecture consisting of bounded local files (MEMORY.md and USER.md) and optional external providers like Honcho for unbounded cross-session user modeling. Unlike most AI agents that forget everything between sessions, Hermes persists knowledge across conversations, curates what it remembers through agent-driven summarization, and can search its own past conversations using FTS5 full-text search over SQLite. As of April 2026, the system ships with 8 external memory provider plugins alongside its built-in memory.
Built-In Local Memory
Hermes Agent ships with two persistent memory files that the agent actively curates across sessions. These files live in the ~/.hermes directory and survive restarts, updates, and reinstalls as long as the data directory is preserved.
MEMORY.md stores agent-curated notes — things the agent has learned about your projects, preferences, environment, and workflows. The agent decides what to write here based on conversation patterns, not explicit user commands. The file is bounded at approximately 2,200 characters, which forces the agent to prioritize the most important information.
USER.md stores a profile of who you are — your role, technical stack, communication preferences, and goals. This file is bounded at approximately 1,375 characters and is updated as the agent learns more about you over time.
Both files are injected into the agent's context at the start of every session, giving it immediate access to everything it has remembered without requiring the user to repeat information. The bounded size keeps context injection fast and prevents memory from consuming too much of the model's context window.
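The injection step can be pictured with a short Python sketch. This is only an illustration under stated assumptions: the function names, the section headings in the output, and the idea of clipping on read are hypothetical, not Hermes Agent's actual implementation. Only the file names, the ~/.hermes location, and the approximate limits come from the description above.

```python
from pathlib import Path

# Approximate limits quoted in this article; the constant names are illustrative.
MEMORY_LIMIT = 2200
USER_LIMIT = 1375


def read_bounded(path: Path, limit: int) -> str:
    """Return the file's text clipped to `limit` characters (empty if the file is missing)."""
    if not path.exists():
        return ""
    return path.read_text(encoding="utf-8")[:limit]


def build_session_preamble(data_dir: Path) -> str:
    """Combine both memory files into one block to prepend to a new session (hypothetical format)."""
    memory = read_bounded(data_dir / "MEMORY.md", MEMORY_LIMIT)
    user = read_bounded(data_dir / "USER.md", USER_LIMIT)
    return f"## Agent notes\n{memory}\n\n## User profile\n{user}\n"


# Example: build the preamble from the default data directory.
preamble = build_session_preamble(Path.home() / ".hermes")
```

Because both inputs are capped, the preamble stays at roughly 3,600 characters no matter how long the agent has been running, which is the point of the bounded design.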
For a broader overview of how AI agents handle memory, see our AI agent memory explained guide.
Memory Types Compared
Hermes Agent organizes knowledge into distinct memory layers, each serving a different purpose and operating on a different timescale.
| Memory Type | How It Works | Persistence | Best For |
|---|---|---|---|
| MEMORY.md (agent notes) | Agent-curated markdown file, ~2,200 char limit | Permanent (disk) | Project details, learned preferences, environment notes |
| USER.md (user profile) | Agent-curated user model, ~1,375 char limit | Permanent (disk) | User role, tech stack, communication style |
| Session history | Full conversation logs stored in SQLite with FTS5 indexing | Permanent (disk) | Recalling past conversations, searching context |
| Honcho conclusions | Dialectic reasoning derives insights about user over time | Permanent (cloud/self-hosted) | Deep user modeling, preference patterns, goal tracking |
| Skills (procedural) | Reusable capabilities created from experience | Permanent (disk) | Learned workflows, task-specific procedures |
| Working context | Current session's conversation window | Session only | Immediate task context, in-progress work |
The layered approach means the agent can recall long-term preferences from MEMORY.md, search weeks-old conversations through session history, and access deep user insights through Honcho — all while keeping the active context window focused on the current task.
Session Search and Recall
Hermes Agent stores all past conversations in a local SQLite database with FTS5 full-text search indexing. This allows the agent to recall specific details from conversations that happened days or weeks ago without those details being stored in the bounded MEMORY.md file.
When the agent needs historical context, it can search its own conversation logs using natural language queries. The FTS5 index makes this search fast even across thousands of past messages. Combined with LLM-powered summarization, the agent can pull relevant context from old conversations and incorporate it into current responses.
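To make the search layer concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name, columns, and sample messages are assumptions made for illustration; the real Hermes schema is not described here. FTS5 matches on keywords, so the sketch also assumes the agent turns a natural-language request into a keyword query first, and it needs a SQLite build with FTS5 enabled.

```python
import sqlite3

# In-memory database for the demo; the schema below is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE messages USING fts5(session_id, role, content)")
conn.executemany(
    "INSERT INTO messages (session_id, role, content) VALUES (?, ?, ?)",
    [
        ("s1", "user", "Let's deploy the billing service to staging on Friday"),
        ("s1", "assistant", "Noted: staging deploy of billing is planned for Friday"),
        ("s2", "user", "Can you review my Rust parser?"),
    ],
)


def search_history(conn, query, limit=5):
    """Return (session_id, role, content) rows matching an FTS5 query, best match first."""
    return conn.execute(
        "SELECT session_id, role, content FROM messages "
        "WHERE messages MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    ).fetchall()


# Keyword query derived from a request like "what did we decide about the billing deploy?"
hits = search_history(conn, "billing AND staging")
```

The matching rows could then be summarized by the model before being added to the current context, which keeps old conversations out of the bounded memory files.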
This is different from simple chat history scrollback. The agent actively searches its past when it determines that historical context would improve its response — for example, when you reference a project you discussed last week or ask the agent to recall a decision you made earlier.
Honcho: AI-Native User Modeling
Honcho is an AI-native memory backend developed by Plastic Labs that adds dialectic reasoning and deep user modeling on top of Hermes Agent's built-in memory. As of April 2026, it is one of 8 external memory providers supported by Hermes Agent.
Honcho works by analyzing conversations after they happen using a dialectic question-and-answer process. It asks itself questions about what the conversation revealed about the user, then derives "conclusions" — structured insights about user preferences, habits, and goals that accumulate over time.
Honcho Tools
When Honcho is enabled, the agent gains access to four specialized tools:
- honcho_profile — retrieves the accumulated user model
- honcho_search — semantic search across all past conversations
- honcho_context — pulls relevant context for the current task
- honcho_conclude — triggers the dialectic reasoning process to derive new insights
The key difference between Honcho and the built-in MEMORY.md is depth. MEMORY.md stores what the agent explicitly writes down. Honcho derives implicit understanding — patterns the user might not even articulate themselves, like preferring concise answers over detailed explanations, or consistently working on certain types of projects.
To learn more about how Hermes Agent compares to other agent frameworks on memory, see our OpenClaw vs Hermes Agent memory and skills comparison.
Hermes Memory vs OpenClaw MEMORY.md
Both Hermes Agent and OpenClaw use markdown-based memory files, but their approaches differ significantly in structure, curation, and extensibility.
| Feature | Hermes Agent | OpenClaw |
|---|---|---|
| Memory files | MEMORY.md + USER.md (two separate files) | Single MEMORY.md |
| Size limits | Bounded (~2,200 chars for MEMORY.md, ~1,375 chars for USER.md) | No built-in size limit |
| Curation | Agent-driven — agent decides what to remember | User and agent both edit directly |
| Session search | FTS5 over SQLite — full-text search of past conversations | No built-in session search |
| External providers | 8 plugins (Honcho, etc.) — one active at a time | No external memory providers |
| User modeling | Dedicated USER.md + optional Honcho dialectic reasoning | Within MEMORY.md, user-managed |
Hermes Agent's bounded approach forces disciplined curation — the agent must prioritize what matters most within the character limits. OpenClaw's unbounded approach gives users more direct control but can lead to bloated memory files that consume context window space.
Neither approach is universally better. Hermes suits users who want the agent to manage memory autonomously. OpenClaw suits users who want direct, transparent control over what the agent remembers. For a detailed comparison of both platforms, see our OpenClaw vs Hermes Agent guide.
Limitations and Tradeoffs
Hermes Agent's memory system has clear constraints that affect how it behaves in practice.
- Bounded local memory is small. At ~2,200 characters for MEMORY.md, the agent must aggressively prioritize. Important details from early conversations may be overwritten as the agent learns new information. There is no version history for memory files by default.
- Single external provider limit. Only one external memory provider (like Honcho) can be active at a time. You cannot combine Honcho's dialectic reasoning with another provider's capabilities simultaneously.
- Session search is local only. The FTS5 search operates on the local SQLite database. If you run Hermes on multiple machines, conversation history is not synced between them unless you manually share the database.
- Honcho requires additional setup. Honcho is a separate service that needs its own configuration. It adds latency to post-conversation processing and introduces a dependency on either Honcho's cloud service or a self-hosted instance.
- No user-facing memory editor. Unlike OpenClaw, where users can directly edit MEMORY.md, Hermes Agent's memory is primarily agent-curated. Users can modify the files manually, but the agent may overwrite those changes during its next curation cycle.
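Since the memory files have no version history by default and the agent can overwrite them, one workaround is to snapshot them yourself. The helper below is a user-side sketch, not a Hermes feature: the function name and backup layout are hypothetical, and it assumes the files live directly in the data directory.

```python
import shutil
from datetime import datetime, timezone
from pathlib import Path


def snapshot_memory(data_dir: Path, backup_dir: Path) -> list[Path]:
    """Copy MEMORY.md and USER.md into backup_dir with a UTC timestamp suffix."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S%fZ")
    copies = []
    for name in ("MEMORY.md", "USER.md"):
        src = data_dir / name
        if src.exists():  # skip files the agent has not created yet
            dst = backup_dir / f"{name}.{stamp}"
            shutil.copy2(src, dst)
            copies.append(dst)
    return copies
```

Running this from a scheduled job, for example `snapshot_memory(Path.home() / ".hermes", Path.home() / "hermes-backups")`, gives you a simple history to diff against when a detail disappears from memory.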
Related Guides
- AI Agent Memory Explained
- What Is Hermes Agent?
- OpenClaw vs Hermes Agent: Memory, Skills, and Routing
- OpenClaw MEMORY.md Guide
Frequently Asked Questions
How much can Hermes Agent remember?
Built-in local memory is bounded at approximately 2,200 characters for agent notes (MEMORY.md) and approximately 1,375 characters for the user profile (USER.md). These limits keep context injection fast and predictable. To go beyond them, enable an external provider like Honcho, which is not subject to these character limits and supports semantic search across all past conversations.
Does Hermes Agent memory persist after restarts?
Yes. Both MEMORY.md and USER.md are written to disk in the ~/.hermes directory. They survive restarts, updates, and even reinstalls as long as the data directory is preserved. For Docker deployments, mount ~/.hermes as a volume to ensure persistence across container rebuilds.
What is Honcho and how does it work with Hermes Agent?
Honcho is an AI-native memory backend developed by Plastic Labs that adds dialectic reasoning and deep user modeling to Hermes Agent. It analyzes conversations after they happen using a question-and-answer process to derive conclusions about user preferences, habits, and goals. These conclusions accumulate over time, giving the agent a deepening understanding beyond what the user explicitly states.
How does Hermes Agent memory compare to OpenClaw MEMORY.md?
Both use markdown files for persistent memory, but the approach differs. OpenClaw uses a single MEMORY.md file that the agent and user both edit directly, with no built-in size limit. Hermes Agent splits memory into two bounded files (MEMORY.md and USER.md) with character limits, and adds agent-curated summarization so the agent decides what to remember. Hermes also offers 8 external memory provider plugins including Honcho for deeper cross-session modeling.
Can I use multiple memory providers at the same time?
No. Hermes Agent allows only one active external memory provider at a time, alongside the built-in MEMORY.md and USER.md. You can switch between providers, but you cannot stack them. The built-in local memory always runs regardless of which external provider is active.
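The one-at-a-time rule can be modeled as a tiny registry. This is a toy sketch of the constraint only: the class, its methods, and the provider name `other_provider` are hypothetical and do not reflect Hermes Agent's plugin API.

```python
class MemoryProviders:
    """Toy registry that allows at most one active external provider."""

    def __init__(self, available):
        self.available = set(available)
        self.active = None  # built-in MEMORY.md and USER.md are always on

    def activate(self, name):
        """Switch to `name` and return the provider it replaced (None if there was none)."""
        if name not in self.available:
            raise ValueError(f"unknown provider: {name}")
        previous, self.active = self.active, name
        return previous


registry = MemoryProviders(["honcho", "other_provider"])
registry.activate("honcho")
replaced = registry.activate("other_provider")  # honcho is deactivated, not stacked
```

Activating a second provider replaces the first rather than adding to it, which mirrors the switching behavior described in the answer above.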