Remote OpenClaw Blog

Claude Mythos: Anthropic's Most Powerful Model Explained

6 min read · 22 May 2026

Claude Mythos is the most capable AI model Anthropic has ever built, scoring 93.9% on SWE-bench Verified and leading 17 of 18 benchmarks the company measured. Announced on April 7, 2026 as "Mythos Preview," the model is not publicly available — access is restricted to Project Glasswing partner organizations for defensive cybersecurity work.

What Is Claude Mythos?

Claude Mythos, internally codenamed "Capybara," is Anthropic's frontier AI model, announced as Mythos Preview on April 7, 2026. It is a general-purpose model, not a cybersecurity-only tool, but its coding and reasoning capabilities have crossed a threshold where it can autonomously find and exploit software vulnerabilities, a change that Fortune described as a "step change in capabilities."

Anthropic's own characterization is that Mythos can "surpass all but the most skilled humans at finding and exploiting software vulnerabilities." This is what led to the decision to restrict access rather than release publicly. For more on the security initiative itself, see our companion guide: Claude Mythos Preview and Project Glasswing Explained.

Benchmark Performance

Claude Mythos leads 17 of 18 benchmarks Anthropic measured, with particularly large improvements in coding and mathematics. The benchmark results represent the largest single-generation improvement Anthropic has ever shipped.

Benchmark	Claude Mythos	Claude Opus 4.6	Improvement
SWE-bench Verified	93.9%	80.8%	+13.1 pts
SWE-bench Pro	77.8%	—	New SOTA
Terminal-Bench 2.0	82.0%	—	—
USAMO 2026	97.6%	42.3%	+55.3 pts
GPQA Diamond	94.5%	—	—
HLE (with tools)	64.7%	—	—
OSWorld	79.6%	—	—
BrowseComp	86.9%	—	—
CyberGym	83.1%	66.6%	+16.5 pts

The most dramatic result is USAMO 2026 — a 55.3-point leap from 42.3% to 97.6%, indicating near-perfect competition-level mathematics. On SWE-bench Verified, the 93.9% score is 13.1 points above Opus 4.6 and far ahead of GPT-5.4.
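The improvement figures can be checked directly from the two score columns. The following is a minimal sketch that uses only the three rows where the table lists both a Mythos and an Opus 4.6 score; the names `SCORES` and `point_gains` are illustrative, not from Anthropic.

```python
# Scores (percent) copied from the benchmark table; only rows that list
# both a Mythos and an Opus 4.6 figure are included.
SCORES = {
    "SWE-bench Verified": (93.9, 80.8),
    "USAMO 2026": (97.6, 42.3),
    "CyberGym": (83.1, 66.6),
}


def point_gains(scores):
    """Return the percentage-point gain of Mythos over Opus 4.6 per benchmark."""
    return {
        name: round(mythos - opus, 1)
        for name, (mythos, opus) in scores.items()
    }


gains = point_gains(SCORES)
for name, gain in gains.items():
    print(f"{name}: +{gain} pts")
```

These are percentage-point differences, not relative improvements, which is why the table's last column is labeled in "pts".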

Core Capabilities for Developers

Mythos is a general-purpose model with frontier-level performance across coding, reasoning, mathematics, and agentic workflows. According to Anthropic's system card, the model autonomously discovered a 17-year-old remote code execution vulnerability in FreeBSD with no human involvement after the initial request.

For coding tasks, Mythos demonstrates substantially stronger long-horizon planning — it can maintain coherent multi-file refactors across large codebases in a way that previous models could not sustain. The Terminal-Bench 2.0 score of 82.0% reflects real-world terminal-based development workflows, not isolated code generation.

On agentic benchmarks, Mythos scores 79.6% on OSWorld (full desktop automation) and 86.9% on BrowseComp (web research tasks). These scores suggest the model can reliably complete multi-step tasks involving real applications, browsers, and file systems.

The GPQA Diamond score of 94.5% places Mythos at expert-level performance on graduate-level science questions, suggesting strong general reasoning beyond just code.

Access and Pricing

Claude Mythos Preview is not available through self-serve API signup. As of April 2026, access is restricted to Project Glasswing participants and approximately 40 additional organizations responsible for maintaining critical software infrastructure.

Pricing for approved participants is $25 per million input tokens and $125 per million output tokens — 5x the rate of Claude Opus 4.6. Anthropic has committed $100 million in usage credits for Glasswing participants to offset this cost.
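To see what these rates mean for a single job, here is a rough cost sketch. The Mythos rates come from the figures above. The Opus 4.6 rates are not quoted in this guide and are derived from the "5x" statement, so treat them as an assumption. The token counts describe a hypothetical agent run.

```python
# Rates in USD per million tokens. Mythos rates are from the article;
# Opus 4.6 rates are derived from its "5x" statement (an assumption).
MYTHOS_RATES = {"input": 25.0, "output": 125.0}
OPUS_RATES = {kind: rate / 5 for kind, rate in MYTHOS_RATES.items()}


def request_cost(input_tokens, output_tokens, rates):
    """Return the USD cost of one request at per-million-token rates."""
    total = input_tokens * rates["input"] + output_tokens * rates["output"]
    return total / 1_000_000


# Hypothetical agent run: 200k input tokens, 20k output tokens.
mythos_cost = request_cost(200_000, 20_000, MYTHOS_RATES)
opus_cost = request_cost(200_000, 20_000, OPUS_RATES)
print(f"Mythos: ${mythos_cost:.2f}, Opus 4.6: ${opus_cost:.2f}")
```

Because both rates scale by the same factor, the 5x ratio holds for any mix of input and output tokens.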

The model is available through Claude API, Amazon Bedrock, Google Vertex AI, and Microsoft Foundry — but only for organizations with approved Glasswing access. There is no public waitlist. Anthropic has stated it does not plan to make Mythos Preview generally available.

What This Means for Non-Glasswing Developers

If you do not have Glasswing access, Claude Opus 4.6 remains the most capable publicly available Claude model. It scores 80.8% on SWE-bench Verified and is the recommended Claude model for production agent tasks in OpenClaw and Hermes Agent workflows.

Mythos vs Opus 4.6: What Changed

Mythos is not a minor iteration. The improvements over Opus 4.6 are the largest single-generation jump in Anthropic's history, particularly in coding (+13.1 pts SWE-bench), mathematics (+55.3 pts USAMO), and cybersecurity (+16.5 pts CyberGym).

Anthropic has not disclosed the architectural differences between Mythos and Opus 4.6. Whether Mythos uses a larger parameter count, a different training approach, or both remains unknown. What is publicly known is that the model's capabilities in vulnerability discovery are qualitatively different — where Opus 4.6 produced 2 working Firefox exploits in testing, Mythos produced 181.

For practical coding work, the gap is meaningful but less dramatic than the security results. A 13.1-point improvement on SWE-bench Verified means Mythos can resolve substantially more real-world GitHub issues autonomously, but Opus 4.6 at 80.8% is already highly capable for most production use cases.

Limitations and Tradeoffs

The primary limitation of Claude Mythos is availability — most developers cannot access it. Anthropic's decision to restrict access is based on the model's cybersecurity capabilities, not general limitations.

Pricing is another constraint. At $25/$125 per million tokens, Mythos is 5x more expensive than Opus 4.6. Even for approved organizations with usage credits, this cost structure limits Mythos to high-value tasks rather than general-purpose agent workloads.

There is no self-hosted option for Mythos. Unlike open-source models such as GLM-5, Mythos cannot be run on your own infrastructure. All inference goes through Anthropic's cloud or approved cloud partners.

When NOT to use Mythos (even if you have access): routine coding tasks where Opus 4.6 performs adequately, cost-sensitive workloads, any task where data must remain entirely on-premises, or when you need multi-model fallback chains.
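The checklist above can be expressed as a small routing predicate. This is only a sketch: the `Task` fields and the `should_use_mythos` function are hypothetical names invented for illustration, and a real workflow would likely weigh these factors rather than treat each one as a hard veto.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical description of a workload; field names are illustrative."""
    routine: bool         # Opus 4.6 already performs adequately
    cost_sensitive: bool  # budget matters more than peak capability
    on_prem_only: bool    # data must remain entirely on-premises
    needs_fallback: bool  # must run inside a multi-model fallback chain


def should_use_mythos(task, has_glasswing_access):
    """Return True only if none of the listed exclusions apply."""
    if not has_glasswing_access:
        return False
    excluded = (
        task.routine
        or task.cost_sensitive
        or task.on_prem_only
        or task.needs_fallback
    )
    return not excluded


# Example: a high-value task with no exclusions, run by an approved organization.
audit = Task(routine=False, cost_sensitive=False, on_prem_only=False, needs_fallback=False)
print(should_use_mythos(audit, has_glasswing_access=True))
```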

FAQ

Can I use Claude Mythos with OpenClaw or Hermes Agent?

Not currently. Claude Mythos Preview is restricted to Project Glasswing partner organizations and is not available through self-serve Claude API access. If Anthropic eventually opens general access, it would work with any framework that supports the Claude API, including OpenClaw and Hermes Agent.

How much does Claude Mythos cost per token?

Claude Mythos Preview costs $25 per million input tokens and $125 per million output tokens for Project Glasswing participants. This is 5x the cost of Claude Opus 4.6. Anthropic is providing $100 million in usage credits to offset costs for participants.

Will Claude Mythos be released to the public?

Anthropic has stated it does not plan to make Mythos Preview generally available. The decision is driven by the model's ability to autonomously find and exploit software vulnerabilities. Whether a future, capability-gated version will be released is unknown.

How does Claude Mythos compare to GPT-5.4?

Claude Mythos outperforms GPT-5.4 on published benchmarks. On SWE-bench Verified, Mythos scores 93.9%, ahead of GPT-5.4. On USAMO 2026 and CyberGym, the gaps are larger still. However, GPT-5.4 is publicly available while Mythos is not.

What is the difference between this guide and the Glasswing guide?

This guide focuses on the Mythos model itself — benchmarks, capabilities, pricing, and what it means for developers. The Glasswing guide covers the cybersecurity initiative, the partner organizations involved, the specific vulnerabilities discovered, and the broader implications for software security.
