Remote OpenClaw Blog
Claude Models for Hermes Agent — Best Workflows and Use Cases
12 min read
Claude Sonnet 4.6 at $3/$15 per million tokens is the best default for Hermes Agent coding and daily operational workflows, while Opus 4.6 at $5/$25 per million tokens should be reserved for complex multi-file analysis and strategic reasoning tasks that genuinely need maximum depth. As of April 2026, Claude models produce the highest-quality code output and the most natural written content of any provider available in Hermes Agent, making them the top choice for workflows where output quality matters more than raw speed.
This post covers practical workflow recipes. For model rankings and API setup, see Claude Models for Hermes — Setup Guide. For OpenClaw configuration, see Claude Models for OpenClaw. For general model benchmarks, see Best Claude Models 2026.
Matching Claude Models to Hermes Tasks
Claude models in Hermes Agent split cleanly into three tiers by task complexity. Sonnet 4.6 covers the broad middle ground, Opus 4.6 handles the top 10-20% of tasks that require deep reasoning, and Haiku 4.5 handles high-volume lightweight work.
The table below maps specific Hermes Agent workflow patterns to the Claude model that delivers the best quality-to-cost ratio for each. Pricing is from the Anthropic models documentation as of April 2026.
| Workflow Type | Best Claude Model | Cost (In/Out per MTok) | Why This Model Wins |
|---|---|---|---|
| Code generation and refactoring | Sonnet 4.6 | $3.00 / $15.00 | Best code quality per dollar; follows conventions and catches edge cases |
| Multi-file architecture review | Opus 4.6 | $5.00 / $25.00 | 1M context + deep reasoning traces through cross-file dependencies |
| Bug diagnosis and debugging | Sonnet 4.6 | $3.00 / $15.00 | Strong at isolating root causes without over-engineering fixes |
| Blog and newsletter drafting | Sonnet 4.6 | $3.00 / $15.00 | Best prose quality across all providers; natural tone, no filler |
| Strategic business analysis | Opus 4.6 | $5.00 / $25.00 | Synthesizes competing data points into nuanced assessments |
| Email classification and routing | Haiku 4.5 | $1.00 / $5.00 | Fast classification at one-third the cost of Sonnet |
| Data normalization | Haiku 4.5 | $1.00 / $5.00 | Structured extraction with reliable JSON output at low cost |
| Legal and compliance review | Opus 4.6 | $5.00 / $25.00 | Identifies subtle clause interactions; least likely to miss implications |
| Meeting notes and summaries | Sonnet 4.6 | $3.00 / $15.00 | Captures decisions and action items without over-summarizing detail |
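The routing table above can be expressed as a simple dispatch map if you select models programmatically. The sketch below is illustrative only — the workflow labels are made up for this example, and the `claude-*` model IDs are placeholders that should be checked against Anthropic's current model list.

```python
# Minimal model-routing sketch for the table above.
# Workflow labels and model IDs are illustrative, not a Hermes API.

ROUTING = {
    "code_generation":      "claude-sonnet-4-6",
    "architecture_review":  "claude-opus-4-6",
    "debugging":            "claude-sonnet-4-6",
    "content_drafting":     "claude-sonnet-4-6",
    "strategic_analysis":   "claude-opus-4-6",
    "email_classification": "claude-haiku-4-5",
    "data_normalization":   "claude-haiku-4-5",
    "compliance_review":    "claude-opus-4-6",
    "meeting_summaries":    "claude-sonnet-4-6",
}

def pick_model(workflow_type: str) -> str:
    """Return the Claude model for a workflow type, defaulting to Sonnet."""
    return ROUTING.get(workflow_type, "claude-sonnet-4-6")
```

Defaulting unknown tasks to Sonnet matches the post's framing: Sonnet is the safe middle tier, and Opus or Haiku are deliberate opt-ins.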
Coding Workflows (Sonnet 4.6)
Claude Sonnet 4.6 produces higher-quality code in Hermes Agent than any other model at its price point. According to Anthropic's model documentation, Sonnet 4.6 scores highest among Claude models on the SWE-bench coding benchmark relative to its cost tier, and in Hermes's agentic context this translates to cleaner implementations, better error handling, and more idiomatic code across Python, TypeScript, and Go.
Recipe: Code Review Agent
This Hermes skill uses Sonnet to review code changes and produce actionable feedback. The key prompt technique is structuring the review criteria explicitly so Sonnet applies each one rather than providing surface-level comments.
```markdown
# Hermes skill: code-review.md
You are a senior code reviewer. For each code change:
1. Read the full diff and identify the intent of the change
2. Check for these specific issues (in priority order):
   - Security: SQL injection, XSS, auth bypasses, exposed secrets
   - Logic errors: off-by-one, null handling, race conditions
   - Performance: N+1 queries, unnecessary allocations, missing indexes
   - Maintainability: unclear naming, missing types, dead code
3. For each issue found, provide:
   - File and line reference
   - Severity: critical / warning / suggestion
   - A concrete code fix, not just a description of the problem
Rules:
- Only flag real issues. Do not pad the review with style nitpicks.
- If the code is solid, say so in one sentence. Do not invent problems.
- Group related issues together rather than listing them per-line.
```
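Outside of Hermes, the same skill text works as a plain system prompt against the Anthropic Messages API. This is a hedged sketch: the model ID is a placeholder, and only the request shape follows Anthropic's documented API.

```python
# Sketch: using a skill file's contents as the system prompt in a
# direct Anthropic Messages API request. Model ID is illustrative.

def build_review_request(skill_text: str, diff: str) -> dict:
    """Build a Messages API payload that applies the review skill to a diff."""
    return {
        "model": "claude-sonnet-4-6",  # placeholder model ID
        "max_tokens": 4096,
        "system": skill_text,          # the code-review.md contents
        "messages": [
            {"role": "user", "content": f"Review this change:\n\n{diff}"}
        ],
    }

# With the anthropic SDK this payload would be sent as
# client.messages.create(**build_review_request(skill, diff)).
```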
Sonnet excels here because it follows the structured criteria without drifting into generic advice. OpenAI's o3 also performs well on code review but costs more and tends to over-explain its reasoning in the output, which adds noise to the review.
Recipe: Automated Refactoring
For refactoring workflows where Hermes needs to modify multiple files while maintaining consistency, Sonnet 4.6's 200K context window handles most codebases. For very large projects, Opus 4.6's 1M context window may be necessary.
```markdown
# Hermes skill: refactor-migration.md
You are a refactoring specialist. The task is to migrate from [old pattern]
to [new pattern] across the codebase.
1. Search for all files using the old pattern
2. For each file, determine if the pattern can be migrated automatically
   or if it requires manual review due to edge cases
3. Apply the migration to auto-migratable files
4. Produce a summary listing:
   - Files migrated successfully (with brief description of changes)
   - Files requiring manual review (with explanation of the edge case)
   - Files skipped (with reason)
Rules:
- Preserve existing tests. If a migration breaks a test, flag it.
- Do not change unrelated code. Keep diffs minimal.
- Run the project's linter/formatter after each file change.
```
Complex Analysis Workflows (Opus 4.6)
Claude Opus 4.6 at $5/$25 per million tokens with a 1M context window is the ceiling model for Hermes Agent. Reserve it for tasks where Sonnet's analysis is demonstrably insufficient — typically tasks that require reasoning about relationships across dozens of documents or evaluating trade-offs with many interacting variables.
Recipe: Multi-Document Contract Analysis
This workflow loads multiple contracts into Opus's 1M context and identifies conflicts, missing clauses, and risk areas across the full set. Sonnet can analyze individual contracts, but Opus catches cross-document issues that require holding all terms in active context simultaneously.
```markdown
# Hermes skill: contract-analysis.md
You are a contract analysis specialist. Given a set of related contracts:
1. Load all provided contracts into context
2. For each contract, extract: parties, effective date, term,
   key obligations, termination clauses, liability caps, IP provisions
3. Cross-reference across all contracts for:
   - Conflicting terms (e.g., different liability caps for same service)
   - Missing clauses present in other contracts (inconsistency)
   - Overlapping obligations that could create dual-liability
   - Termination provisions that conflict with dependent agreements
4. Produce a risk matrix: issue, affected contracts, severity, recommendation
Flag any clause where the financial exposure exceeds $50,000.
Do not provide legal advice — present findings as analytical observations.
```
This skill genuinely requires Opus because cross-referencing terms across 5+ contracts can exceed 100K tokens of source material. Sonnet's 200K context can technically hold this, but its reasoning quality degrades on cross-document synthesis beyond approximately 80K tokens of dense legal text. Opus maintains analytical consistency across its full 1M window according to Anthropic's documentation.
Recipe: Strategic Decision Framework
For business analysis where Hermes Agent needs to evaluate multiple options against competing criteria with imperfect data, Opus produces more nuanced assessments than any other model available in Hermes.
```markdown
# Hermes skill: strategic-analysis.md
You are a strategic advisor. Evaluate the decision with these steps:
1. Identify all options (stated and unstated alternatives)
2. For each option, analyze against these criteria:
   - Financial impact (quantify where possible, estimate ranges elsewhere)
   - Implementation complexity (timeline, dependencies, resource needs)
   - Risk factors (what could go wrong, probability, mitigation options)
   - Opportunity cost (what you give up by choosing this option)
3. Present a weighted comparison matrix
4. Provide a recommendation with explicit confidence level and caveats
Rules:
- State assumptions clearly. Do not hide uncertainty behind confident language.
- If data is insufficient for a reliable estimate, say so and describe
  what additional data would improve the analysis.
- Consider second-order effects: how does each option change future decisions?
```
Content and Writing Workflows
Claude models produce the best written content of any provider available in Hermes Agent as of April 2026. Sonnet 4.6 is the default choice for content workflows because it writes in a natural, direct style without the over-structured output that OpenAI reasoning models tend to produce, and without the occasional verbosity that Gemini models exhibit on long-form content.
Recipe: Editorial Content Pipeline
This workflow chain handles the full content production cycle — from research through drafting to social media distribution — using Sonnet for every stage.
```markdown
# Hermes skill: editorial-pipeline.md
You are a content editor. Follow this production pipeline:
Phase 1 — Research:
- Search for 5 recent sources on the assigned topic
- Extract key facts, statistics, and expert quotes with citations
- Identify the angle that differentiates this piece from existing coverage
Phase 2 — Outline:
- Create a structure with H2 sections that builds a logical argument
- Each section should have a clear thesis and supporting evidence
- Include a TL;DR at the top and FAQ at the bottom
Phase 3 — Draft:
- Write in active voice, short paragraphs (3 sentences max)
- Lead every section with a standalone factual sentence
- Integrate citations inline, not as footnotes
- Target [word count] words — do not pad with filler
Phase 4 — Distribution:
- Generate 3 social variants: LinkedIn (professional), Twitter (concise),
  newsletter teaser (curiosity-driven)
- Each variant must stand alone without requiring the full article
```
Sonnet vs. Opus for Content
Sonnet 4.6 handles most content tasks because the quality difference between Sonnet and Opus on standard blog posts, newsletters, and social content is minimal. Opus becomes worthwhile for content that requires deep domain synthesis — think whitepapers that draw conclusions from multiple research papers, or technical guides that need to reconcile conflicting documentation sources. For standard editorial content, Sonnet delivers 90% of Opus's quality at 60% of the cost.
Claude-Specific Prompt Patterns for Hermes
Claude models respond to prompt structure differently than OpenAI or Gemini models in Hermes Agent. These patterns are tested specifically in Hermes's agent loop context.
Sonnet 4.6 Prompt Patterns
- Use XML tags for structure. Claude models parse XML-tagged sections more reliably than markdown headers within skill definitions. Wrap distinct instruction blocks in tags like `<task>`, `<rules>`, and `<output-format>` for cleaner separation.
- State what NOT to do. Claude follows negative instructions more precisely than most models. "Do not add explanatory comments to the code" is more effective than "Write clean code" for controlling output.
- Leverage prompt caching. Hermes sends tool definitions as system prompt content on every turn. Anthropic's 90% cache discount means repeated tool definitions cost one-tenth of normal input pricing. Structure your skills so the static instructions are at the top and variable inputs are at the bottom — this maximizes the cached portion.
- Avoid chain-of-thought instructions. Unlike OpenAI's o-series, Sonnet does not benefit from "think step by step" instructions. It already reasons internally. Adding explicit chain-of-thought instructions often produces verbose output without improving quality.
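Hermes builds the request itself, but if you hit the API directly, the cache split described above is controlled by a `cache_control` marker on the system blocks. A minimal sketch, assuming direct API access; the skill text and model ID are placeholders, and the `cache_control` shape follows Anthropic's documented prompt-caching API:

```python
# Sketch: structuring a request so the static skill text is cached
# while the per-turn input stays outside the cached portion.

def build_cached_request(static_skill: str, variable_input: str) -> dict:
    return {
        "model": "claude-sonnet-4-6",  # placeholder model ID
        "max_tokens": 2048,
        "system": [
            {
                "type": "text",
                "text": static_skill,                    # cached portion
                "cache_control": {"type": "ephemeral"},  # cache breakpoint
            }
        ],
        "messages": [
            {"role": "user", "content": variable_input}  # varies per turn
        ],
    }
```

Everything up to the `cache_control` breakpoint is eligible for the 90% cache discount on subsequent turns, which is why the static instructions belong above it.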
Opus 4.6 Prompt Patterns
- Give Opus room to think. Unlike Sonnet, Opus benefits from longer context and more nuanced instructions. Do not over-constrain Opus with rigid output templates — it produces better analysis when allowed to structure its reasoning naturally.
- Use Opus for ambiguity. When the task requirements are genuinely unclear or contradictory, Opus is the only Claude model that reliably surfaces the ambiguity rather than silently picking one interpretation. Frame tasks with "Identify any areas where these requirements conflict" to activate this behavior.
- Enable extended thinking. For Hermes workflows where analysis quality matters more than latency, enable extended thinking in your Hermes config. This gives Opus additional internal reasoning budget similar to OpenAI's o-series models. See the memory system guide for how extended thinking interacts with Hermes's context management.
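If you configure the call directly rather than through Hermes's settings, extended thinking is a documented parameter on Anthropic's Messages API. A sketch; the model ID is a placeholder and the budget value is an arbitrary example:

```python
# Sketch: enabling extended thinking on an Opus request.
# The "thinking" parameter follows Anthropic's extended-thinking API;
# budget_tokens is an arbitrary example value.

def build_thinking_request(prompt: str, budget: int = 16000) -> dict:
    return {
        "model": "claude-opus-4-6",   # placeholder model ID
        "max_tokens": budget + 4096,  # max_tokens must exceed the budget
        "thinking": {"type": "enabled", "budget_tokens": budget},
        "messages": [{"role": "user", "content": prompt}],
    }
```

Note that thinking tokens are billed as output tokens, so a large budget multiplies Opus's already-high $25/MTok output cost.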
Haiku 4.5 Prompt Patterns
- Be extremely explicit. Haiku has less capacity for inferring intent. Spell out every step, provide exact output formats, and include examples. Ambiguity that Sonnet handles gracefully will produce inconsistent results on Haiku.
- Use for routing, not reasoning. Haiku's sweet spot in Hermes is classifying inputs and routing them to the correct workflow. Use it as a triage layer that decides which model should handle the actual work.
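The triage pattern above is a two-stage dispatch: Haiku returns a label, and the label decides which model handles the real work. In this sketch `classify()` is a stub standing in for an actual Haiku call, and the labels and model IDs are illustrative:

```python
# Sketch of a Haiku triage layer. classify() is a stub standing in for
# a real Haiku call that returns exactly one label from LABELS.

LABELS = {"billing", "bug_report", "sales", "other"}

ESCALATION = {
    "billing":    "claude-haiku-4-5",   # simple routing, Haiku handles it
    "bug_report": "claude-sonnet-4-6",  # needs real reasoning
    "sales":      "claude-haiku-4-5",
    "other":      "claude-sonnet-4-6",
}

def classify(text: str) -> str:
    """Stub for a Haiku classification call; the real version prompts
    Haiku to answer with exactly one label from LABELS."""
    lowered = text.lower()
    if "invoice" in lowered:
        return "billing"
    if "crash" in lowered or "error" in lowered:
        return "bug_report"
    return "other"

def route(text: str) -> str:
    """Return the model that should handle this input."""
    return ESCALATION.get(classify(text), "claude-sonnet-4-6")
```

Constraining Haiku to a fixed label set is the key design choice: it keeps the cheap model doing only classification, with all open-ended reasoning deferred to Sonnet.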
Limitations and Tradeoffs
Claude models in Hermes Agent have specific constraints that affect workflow design.
- Sonnet refuses certain tasks. Claude models have stronger content filters than OpenAI or Gemini. Workflows involving sensitive topics (medical advice, legal strategy, security testing) may trigger refusals. This is a model-level constraint, not a Hermes configuration issue — switching to a different provider is the only workaround.
- Opus cost adds up quickly. At $5/$25 per million tokens, running Opus on routine tasks is expensive. A 10-message Hermes conversation with Opus can cost $0.50-$2.00 depending on context size. Use the task routing table above to avoid running Opus on tasks where Sonnet is sufficient.
- 200K context ceiling on Sonnet. For workflows that need to hold an entire large codebase or multiple long documents, Sonnet's 200K limit forces you to either upgrade to Opus or redesign the workflow to process documents in chunks. Chunked processing loses cross-document coherence.
- No vision in Hermes via Claude. As of April 2026, Hermes Agent's vision pipeline does not fully support Claude's multimodal capabilities. If your workflow requires image analysis, you may need to use OpenAI's GPT-4o as an auxiliary model or wait for Hermes to add Claude vision support.
- Rate limits are lower than OpenAI. Anthropic's API rate limits are more restrictive than OpenAI's, especially on Opus. High-volume Hermes deployments running batch workflows may need to implement request queuing or switch to Sonnet for throughput-sensitive tasks.
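The Opus cost estimate above is easy to sanity-check with arithmetic. The per-turn token counts below are assumptions for illustration; the prices are the $5/$25 per MTok figures used throughout this post.

```python
# Back-of-envelope cost check for the Opus estimate above.
# Token counts per turn are illustrative assumptions.

def conversation_cost(turns: int,
                      in_tokens_per_turn: int,
                      out_tokens_per_turn: int,
                      in_price_per_mtok: float,
                      out_price_per_mtok: float) -> float:
    """Total dollar cost of a multi-turn conversation."""
    total_in = turns * in_tokens_per_turn
    total_out = turns * out_tokens_per_turn
    return (total_in * in_price_per_mtok
            + total_out * out_price_per_mtok) / 1_000_000

# 10 Opus turns at ~15K input tokens and ~800 output tokens each:
# 150,000 * $5/MTok + 8,000 * $25/MTok = $0.75 + $0.20 = $0.95
cost = conversation_cost(10, 15_000, 800, 5.00, 25.00)
```

Because Hermes resends the growing context on every turn, input tokens dominate long conversations, which is also why prompt caching matters so much for per-turn cost.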
Related Guides
- Best Claude Models for Hermes — Setup Guide
- Best Claude Models for OpenClaw
- Best Claude Models 2026
- Hermes Agent Cost Breakdown
FAQ
Which Claude model is best for coding in Hermes Agent?
Claude Sonnet 4.6 at $3/$15 per million tokens is the best Claude model for coding workflows in Hermes Agent. It produces the highest-quality code relative to cost, with strong convention-following, edge case handling, and idiomatic output across Python, TypeScript, and Go. Use Opus 4.6 only for multi-file architecture reviews where cross-file reasoning is critical.
Is Claude Opus 4.6 worth the cost for Hermes Agent?
Opus 4.6 is worth the $5/$25 per million token cost only for tasks that genuinely require its depth: multi-document contract analysis, strategic decision frameworks, architecture reviews spanning large codebases, and tasks where Sonnet's analysis is demonstrably insufficient. For daily coding, content drafting, and standard operations, Sonnet delivers 90% of Opus's quality at 60% of the cost.
How does Claude compare to OpenAI for Hermes Agent content workflows?
Claude produces noticeably better written content than OpenAI models in Hermes Agent. Sonnet 4.6 writes in a more natural, direct style without the over-structured output that OpenAI's reasoning models tend to produce. For content workflows — blog drafts, newsletters, social media, editorial content — Claude is the stronger choice. OpenAI's o3 is better for research and data synthesis tasks that feed into content but are not the writing itself.
What is the cheapest Claude model that works with Hermes Agent?
Haiku 4.5 at $1/$5 per million tokens is the cheapest Claude model compatible with Hermes Agent. It works well for email classification, input routing, data normalization, and other structured tasks that do not require deep reasoning. It is not recommended as a primary model for complex agent workflows, but it is excellent as a triage layer that routes tasks to Sonnet or Opus.
Does Claude's prompt caching work with Hermes Agent?
Yes. Anthropic's prompt caching gives a 90% discount on repeated content in the system prompt. Since Hermes Agent sends tool definitions as system prompt content on every turn, this discount applies automatically and significantly reduces per-turn costs. Structure your skill definitions with static instructions at the top and variable inputs at the bottom to maximize the cached portion.