Remote OpenClaw Blog
Claude Models for Hermes Agent — Best Workflows and Use Cases
12 min read
Claude Sonnet 4.6 at $3/$15 per million tokens is the best default for Hermes Agent coding and daily operational workflows, while Opus 4.6 at $5/$25 per million tokens should be reserved for complex multi-file analysis and strategic reasoning tasks that genuinely need maximum depth. As of April 2026, Claude models produce the highest-quality code output and the most natural written content of any provider available in Hermes Agent, making them the top choice for workflows where output quality matters more than raw speed.
This post covers practical workflow recipes. For model rankings and API setup, see Claude Models for Hermes — Setup Guide. For OpenClaw configuration, see Claude Models for OpenClaw. For general model benchmarks, see Best Claude Models 2026.
Matching Claude Models to Hermes Tasks
Claude models in Hermes Agent split cleanly into three tiers by task complexity. Sonnet 4.6 covers the broad middle ground, Opus 4.6 handles the top 10-20% of tasks that require deep reasoning, and Haiku 4.5 handles high-volume lightweight work.
The table below maps specific Hermes Agent workflow patterns to the Claude model that delivers the best quality-to-cost ratio for each. Pricing is from the Anthropic models documentation as of April 2026.
| Workflow Type | Best Claude Model | Cost (In/Out per MTok) | Why This Model Wins |
|---|---|---|---|
| Code generation and refactoring | Sonnet 4.6 | $3.00 / $15.00 | Best code quality per dollar; follows conventions and catches edge cases |
| Multi-file architecture review | Opus 4.6 | $5.00 / $25.00 | 1M context + deep reasoning traces through cross-file dependencies |
| Bug diagnosis and debugging | Sonnet 4.6 | $3.00 / $15.00 | Strong at isolating root causes without over-engineering fixes |
| Blog and newsletter drafting | Sonnet 4.6 | $3.00 / $15.00 | Best prose quality across all providers; natural tone, no filler |
| Strategic business analysis | Opus 4.6 | $5.00 / $25.00 | Synthesizes competing data points into nuanced assessments |
| Email classification and routing | Haiku 4.5 | $1.00 / $5.00 | Fast classification at one-third the cost of Sonnet |
| Data normalization | Haiku 4.5 | $1.00 / $5.00 | Structured extraction with reliable JSON output at low cost |
| Legal and compliance review | Opus 4.6 | $5.00 / $25.00 | Identifies subtle clause interactions; least likely to miss implications |
| Meeting notes and summaries | Sonnet 4.6 | $3.00 / $15.00 | Captures decisions and action items without over-summarizing detail |
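The routing table above can be expressed as a simple dispatch map if you select models programmatically. The sketch below is illustrative only — the workflow labels are made up for this example, and the `claude-*` model IDs are placeholders that should be checked against Anthropic's current model list.

```python
# Minimal model-routing sketch for the table above.
# Workflow labels and model IDs are illustrative, not a Hermes API.

ROUTING = {
    "code_generation":      "claude-sonnet-4-6",
    "architecture_review":  "claude-opus-4-6",
    "debugging":            "claude-sonnet-4-6",
    "content_drafting":     "claude-sonnet-4-6",
    "strategic_analysis":   "claude-opus-4-6",
    "email_classification": "claude-haiku-4-5",
    "data_normalization":   "claude-haiku-4-5",
    "compliance_review":    "claude-opus-4-6",
    "meeting_summaries":    "claude-sonnet-4-6",
}

def pick_model(workflow_type: str) -> str:
    """Return the Claude model for a workflow type, defaulting to Sonnet."""
    return ROUTING.get(workflow_type, "claude-sonnet-4-6")
```

Defaulting unknown tasks to Sonnet matches the post's framing: Sonnet is the safe middle tier, and Opus or Haiku are deliberate opt-ins.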
Coding Workflows (Sonnet 4.6)
Claude Sonnet 4.6 produces higher-quality code in Hermes Agent than any other model at its price point. According to Anthropic's model documentation, Sonnet 4.6 scores highest among Claude models on the SWE-bench coding benchmark relative to its cost tier, and in Hermes's agentic context this translates to cleaner implementations, better error handling, and more idiomatic code across Python, TypeScript, and Go.
Recipe: Code Review Agent
This Hermes skill uses Sonnet to review code changes and produce actionable feedback. The key prompt technique is structuring the review criteria explicitly so Sonnet applies each one rather than providing surface-level comments.
```markdown
# Hermes skill: code-review.md
You are a senior code reviewer. For each code change:
1. Read the full diff and identify the intent of the change
2. Check for these specific issues (in priority order):
   - Security: SQL injection, XSS, auth bypasses, exposed secrets
   - Logic errors: off-by-one, null handling, race conditions
   - Performance: N+1 queries, unnecessary allocations, missing indexes
   - Maintainability: unclear naming, missing types, dead code
3. For each issue found, provide:
   - File and line reference
   - Severity: critical / warning / suggestion
   - A concrete code fix, not just a description of the problem
Rules:
- Only flag real issues. Do not pad the review with style nitpicks.
- If the code is solid, say so in one sentence. Do not invent problems.
- Group related issues together rather than listing them per-line.
```
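Outside of Hermes, the same skill text works as a plain system prompt against the Anthropic Messages API. This is a hedged sketch: the model ID is a placeholder, and only the request shape follows Anthropic's documented API.

```python
# Sketch: using a skill file's contents as the system prompt in a
# direct Anthropic Messages API request. Model ID is illustrative.

def build_review_request(skill_text: str, diff: str) -> dict:
    """Build a Messages API payload that applies the review skill to a diff."""
    return {
        "model": "claude-sonnet-4-6",  # placeholder model ID
        "max_tokens": 4096,
        "system": skill_text,          # the code-review.md contents
        "messages": [
            {"role": "user", "content": f"Review this change:\n\n{diff}"}
        ],
    }

# With the anthropic SDK this payload would be sent as
# client.messages.create(**build_review_request(skill, diff)).
```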
Sonnet excels here because it follows the structured criteria without drifting into generic advice. OpenAI's o3 also performs well on code review but costs more and tends to over-explain its reasoning in the output, which adds noise to the review.
Recipe: Automated Refactoring
For refactoring workflows where Hermes needs to modify multiple files while maintaining consistency, Sonnet 4.6's 200K context window handles most codebases. For very large projects, Opus 4.6's 1M context window may be necessary.
```markdown
# Hermes skill: refactor-migration.md
You are a refactoring specialist. The task is to migrate from [old pattern]
to [new pattern] across the codebase.
1. Search for all files using the old pattern
2. For each file, determine if the pattern can be migrated automatically
   or if it requires manual review due to edge cases
3. Apply the migration to auto-migratable files
4. Produce a summary listing:
   - Files migrated successfully (with brief description of changes)
   - Files requiring manual review (with explanation of the edge case)
   - Files skipped (with reason)
Rules:
- Preserve existing tests. If a migration breaks a test, flag it.
- Do not change unrelated code. Keep diffs minimal.
- Run the project's linter/formatter after each file change.
```
Complex Analysis Workflows (Opus 4.6)
Claude Opus 4.6 at $5/$25 per million tokens with a 1M context window is the ceiling model for Hermes Agent. Reserve it for tasks where Sonnet's analysis is demonstrably insufficient — typically tasks that require reasoning about relationships across dozens of documents or evaluating trade-offs with many interacting variables.
Recipe: Multi-Document Contract Analysis
This workflow loads multiple contracts into Opus's 1M context and identifies conflicts, missing clauses, and risk areas across the full set. Sonnet can analyze individual contracts, but Opus catches cross-document issues that require holding all terms in active context simultaneously.
```markdown
# Hermes skill: contract-analysis.md
You are a contract analysis specialist. Given a set of related contracts:
1. Load all provided contracts into context
2. For each contract, extract: parties, effective date, term,
   key obligations, termination clauses, liability caps, IP provisions
3. Cross-reference across all contracts for:
   - Conflicting terms (e.g., different liability caps for same service)
   - Missing clauses present in other contracts (inconsistency)
   - Overlapping obligations that could create dual-liability
   - Termination provisions that conflict with dependent agreements
4. Produce a risk matrix: issue, affected contracts, severity, recommendation
Flag any clause where the financial exposure exceeds $50,000.
Do not provide legal advice — present findings as analytical observations.
```
This skill genuinely requires Opus because cross-referencing terms across 5+ contracts can exceed 100K tokens of source material. Sonnet's 200K context can technically hold this, but its reasoning quality degrades on cross-document synthesis beyond approximately 80K tokens of dense legal text. Opus maintains analytical consistency across its full 1M window according to Anthropic's documentation.
Recipe: Strategic Decision Framework
For business analysis where Hermes Agent needs to evaluate multiple options against competing criteria with imperfect data, Opus produces more nuanced assessments than any other model available in Hermes.
```markdown
# Hermes skill: strategic-analysis.md
You are a strategic advisor. Evaluate the decision with these steps:
1. Identify all options (stated and unstated alternatives)
2. For each option, analyze against these criteria:
   - Financial impact (quantify where possible, estimate ranges elsewhere)
   - Implementation complexity (timeline, dependencies, resource needs)
   - Risk factors (what could go wrong, probability, mitigation options)
   - Opportunity cost (what you give up by choosing this option)
3. Present a weighted comparison matrix
4. Provide a recommendation with explicit confidence level and caveats
Rules:
- State assumptions clearly. Do not hide uncertainty behind confident language.
- If data is insufficient for a reliable estimate, say so and describe
  what additional data would improve the analysis.
- Consider second-order effects: how does each option change future decisions?
```
Content and Writing Workflows
Claude models produce the best written content of any provider available in Hermes Agent as of April 2026. Sonnet 4.6 is the default choice for content workflows because it writes in a natural, direct style without the over-structured output that OpenAI reasoning models tend to produce, and without the occasional verbosity that Gemini models exhibit on long-form content.
Recipe: Editorial Content Pipeline
This workflow chain handles the full content production cycle — from research through drafting to social media distribution — using Sonnet for every stage.
```markdown
# Hermes skill: editorial-pipeline.md
You are a content editor. Follow this production pipeline:
Phase 1 — Research:
- Search for 5 recent sources on the assigned topic
- Extract key facts, statistics, and expert quotes with citations
- Identify the angle that differentiates this piece from existing coverage
Phase 2 — Outline:
- Create a structure with H2 sections that builds a logical argument
- Each section should have a clear thesis and supporting evidence
- Include a TL;DR at the top and FAQ at the bottom
Phase 3 — Draft:
- Write in active voice, short paragraphs (3 sentences max)
- Lead every section with a standalone factual sentence
- Integrate citations inline, not as footnotes
- Target [word count] words — do not pad with filler
Phase 4 — Distribution:
- Generate 3 social variants: LinkedIn (professional), Twitter (concise),
  newsletter teaser (curiosity-driven)
- Each variant must stand alone without requiring the full article
```
Sonnet vs. Opus for Content
Sonnet 4.6 handles most content tasks because the quality difference between Sonnet and Opus on standard blog posts, newsletters, and social content is minimal. Opus becomes worthwhile for content that requires deep domain synthesis — think whitepapers that draw conclusions from multiple research papers, or technical guides that need to reconcile conflicting documentation sources. For standard editorial content, Sonnet delivers 90% of Opus's quality at 60% of the cost.
Claude-Specific Prompt Patterns for Hermes
Claude models respond to prompt structure differently than OpenAI or Gemini models in Hermes Agent. These patterns are tested specifically in Hermes's agent loop context.
Sonnet 4.6 Prompt Patterns
- Use XML tags for structure. Claude models parse XML-tagged sections more reliably than markdown headers within skill definitions. Wrap distinct instruction blocks in tags like `<task>`, `<rules>`, and `<output-format>` for cleaner separation.
- State what NOT to do. Claude follows negative instructions more precisely than most models. "Do not add explanatory comments to the code" is more effective than "Write clean code" for controlling output.
- Leverage prompt caching. Hermes sends tool definitions as system prompt content on every turn. Anthropic's 90% cache discount means repeated tool definitions cost one-tenth of normal input pricing. Structure your skills so the static instructions are at the top and variable inputs are at the bottom — this maximizes the cached portion.
- Avoid chain-of-thought instructions. Unlike OpenAI's o-series, Sonnet does not benefit from "think step by step" instructions. It already reasons internally. Adding explicit chain-of-thought instructions often produces verbose output without improving quality.
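Hermes builds the request itself, but if you hit the API directly, the cache split described above is controlled by a `cache_control` marker on the system blocks. A minimal sketch, assuming direct API access; the skill text and model ID are placeholders, and the `cache_control` shape follows Anthropic's documented prompt-caching API:

```python
# Sketch: structuring a request so the static skill text is cached
# while the per-turn input stays outside the cached portion.

def build_cached_request(static_skill: str, variable_input: str) -> dict:
    return {
        "model": "claude-sonnet-4-6",  # placeholder model ID
        "max_tokens": 2048,
        "system": [
            {
                "type": "text",
                "text": static_skill,                    # cached portion
                "cache_control": {"type": "ephemeral"},  # cache breakpoint
            }
        ],
        "messages": [
            {"role": "user", "content": variable_input}  # varies per turn
        ],
    }
```

Everything up to the `cache_control` breakpoint is eligible for the 90% cache discount on subsequent turns, which is why the static instructions belong above it.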
Opus 4.6 Prompt Patterns
- Give Opus room to think. Unlike Sonnet, Opus benefits from longer context and more nuanced instructions. Do not over-constrain Opus with rigid output templates — it produces better analysis when allowed to structure its reasoning naturally.
- Use Opus for ambiguity. When the task requirements are genuinely unclear or contradictory, Opus is the only Claude model that reliably surfaces the ambiguity rather than silently picking one interpretation. Frame tasks with "Identify any areas where these requirements conflict" to activate this behavior.
- Enable extended thinking. For Hermes workflows where analysis quality matters more than latency, enable extended thinking in your Hermes config. This gives Opus additional internal reasoning budget similar to OpenAI's o-series models. See the memory system guide for how extended thinking interacts with Hermes's context management.
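If you configure the call directly rather than through Hermes's settings, extended thinking is a documented parameter on Anthropic's Messages API. A sketch; the model ID is a placeholder and the budget value is an arbitrary example:

```python
# Sketch: enabling extended thinking on an Opus request.
# The "thinking" parameter follows Anthropic's extended-thinking API;
# budget_tokens is an arbitrary example value.

def build_thinking_request(prompt: str, budget: int = 16000) -> dict:
    return {
        "model": "claude-opus-4-6",   # placeholder model ID
        "max_tokens": budget + 4096,  # max_tokens must exceed the budget
        "thinking": {"type": "enabled", "budget_tokens": budget},
        "messages": [{"role": "user", "content": prompt}],
    }
```

Note that thinking tokens are billed as output tokens, so a large budget multiplies Opus's already-high $25/MTok output cost.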
Haiku 4.5 Prompt Patterns
- Be extremely explicit. Haiku has less capacity for inferring intent. Spell out every step, provide exact output formats, and include examples. Ambiguity that Sonnet handles gracefully will produce inconsistent results on Haiku.
- Use for routing, not reasoning. Haiku's sweet spot in Hermes is classifying inputs and routing them to the correct workflow. Use it as a triage layer that decides which model should handle the actual work.
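The triage pattern above is a two-stage dispatch: Haiku returns a label, and the label decides which model handles the real work. In this sketch `classify()` is a stub standing in for an actual Haiku call, and the labels and model IDs are illustrative:

```python
# Sketch of a Haiku triage layer. classify() is a stub standing in for
# a real Haiku call that returns exactly one label from LABELS.

LABELS = {"billing", "bug_report", "sales", "other"}

ESCALATION = {
    "billing":    "claude-haiku-4-5",   # simple routing, Haiku handles it
    "bug_report": "claude-sonnet-4-6",  # needs real reasoning
    "sales":      "claude-haiku-4-5",
    "other":      "claude-sonnet-4-6",
}

def classify(text: str) -> str:
    """Stub for a Haiku classification call; the real version prompts
    Haiku to answer with exactly one label from LABELS."""
    lowered = text.lower()
    if "invoice" in lowered:
        return "billing"
    if "crash" in lowered or "error" in lowered:
        return "bug_report"
    return "other"

def route(text: str) -> str:
    """Return the model that should handle this input."""
    return ESCALATION.get(classify(text), "claude-sonnet-4-6")
```

Constraining Haiku to a fixed label set is the key design choice: it keeps the cheap model doing only classification, with all open-ended reasoning deferred to Sonnet.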
Limitations and Tradeoffs
Claude models in Hermes Agent have specific constraints that affect workflow design.
- Sonnet refuses certain tasks. Claude models have stronger content filters than OpenAI or Gemini. Workflows involving sensitive topics (medical advice, legal strategy, security testing) may trigger refusals. This is a model-level constraint, not a Hermes configuration issue — switching to a different provider is the only workaround.
- Opus cost adds up quickly. At $5/$25 per million tokens, running Opus on routine tasks is expensive. A 10-message Hermes conversation with Opus can cost $0.50-$2.00 depending on context size. Use the task routing table above to avoid running Opus on tasks where Sonnet is sufficient.
- 200K context ceiling on Sonnet. For workflows that need to hold an entire large codebase or multiple long documents, Sonnet's 200K limit forces you to either upgrade to Opus or redesign the workflow to process documents in chunks. Chunked processing loses cross-document coherence.
- No vision in Hermes via Claude. As of April 2026, Hermes Agent's vision pipeline does not fully support Claude's multimodal capabilities. If your workflow requires image analysis, you may need to use OpenAI's GPT-4o as an auxiliary model or wait for Hermes to add Claude vision support.
- Rate limits are lower than OpenAI. Anthropic's API rate limits are more restrictive than OpenAI's, especially on Opus. High-volume Hermes deployments running batch workflows may need to implement request queuing or switch to Sonnet for throughput-sensitive tasks.
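The Opus cost estimate above is easy to sanity-check with arithmetic. The per-turn token counts below are assumptions for illustration; the prices are the $5/$25 per MTok figures used throughout this post.

```python
# Back-of-envelope cost check for the Opus estimate above.
# Token counts per turn are illustrative assumptions.

def conversation_cost(turns: int,
                      in_tokens_per_turn: int,
                      out_tokens_per_turn: int,
                      in_price_per_mtok: float,
                      out_price_per_mtok: float) -> float:
    """Total dollar cost of a multi-turn conversation."""
    total_in = turns * in_tokens_per_turn
    total_out = turns * out_tokens_per_turn
    return (total_in * in_price_per_mtok
            + total_out * out_price_per_mtok) / 1_000_000

# 10 Opus turns at ~15K input tokens and ~800 output tokens each:
# 150,000 * $5/MTok + 8,000 * $25/MTok = $0.75 + $0.20 = $0.95
cost = conversation_cost(10, 15_000, 800, 5.00, 25.00)
```

Because Hermes resends the growing context on every turn, input tokens dominate long conversations, which is also why prompt caching matters so much for per-turn cost.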
Related Guides
- Best Claude Models for Hermes — Setup Guide
- Best Claude Models for OpenClaw
- Best Claude Models 2026
- Hermes Agent Cost Breakdown
FAQ
Which Claude model is best for coding in Hermes Agent?
Claude Sonnet 4.6 at $3/$15 per million tokens is the best Claude model for coding workflows in Hermes Agent. It produces the highest-quality code relative to cost, with strong convention-following, edge case handling, and idiomatic output across Python, TypeScript, and Go. Use Opus 4.6 only for multi-file architecture reviews where cross-file reasoning is critical.
Is Claude Opus 4.6 worth the cost for Hermes Agent?
Opus 4.6 is worth the $5/$25 per million token cost only for tasks that genuinely require its depth: multi-document contract analysis, strategic decision frameworks, architecture reviews spanning large codebases, and tasks where Sonnet's analysis is demonstrably insufficient. For daily coding, content drafting, and standard operations, Sonnet delivers 90% of Opus's quality at 60% of the cost.
How does Claude compare to OpenAI for Hermes Agent content workflows?
Claude produces noticeably better written content than OpenAI models in Hermes Agent. Sonnet 4.6 writes in a more natural, direct style without the over-structured output that OpenAI's reasoning models tend to produce. For content workflows — blog drafts, newsletters, social media, editorial content — Claude is the stronger choice. OpenAI's o3 is better for research and data synthesis tasks that feed into content but are not the writing itself.
What is the cheapest Claude model that works with Hermes Agent?
Haiku 4.5 at $1/$5 per million tokens is the cheapest Claude model compatible with Hermes Agent. It works well for email classification, input routing, data normalization, and other structured tasks that do not require deep reasoning. It is not recommended as a primary model for complex agent workflows, but it is excellent as a triage layer that routes tasks to Sonnet or Opus.
Does Claude's prompt caching work with Hermes Agent?
Yes. Anthropic's prompt caching gives a 90% discount on repeated content in the system prompt. Since Hermes Agent sends tool definitions as system prompt content on every turn, this discount applies automatically and significantly reduces per-turn costs. Structure your skill definitions with static instructions at the top and variable inputs at the bottom to maximize the cached portion.