Remote OpenClaw Blog

MiniMax M2 on OpenClaw: Setup, Pricing, and Performance Guide


What Is MiniMax M2?

MiniMax is a Chinese AI company that has been building large language models since 2021, primarily focused on achieving high performance at low inference cost. Their M2 family is the latest product of that philosophy — a Mixture of Experts architecture that activates only 10 billion parameters per forward pass out of 230 billion total, resulting in a model that punches far above its compute weight class.

The pitch is simple: 90% of frontier model quality at 7% of the cost. That sounds like marketing, but the benchmarks largely back it up. M2.7, the latest version, scores 78% on SWE-bench Verified — within one percentage point of Claude Sonnet 4 — while costing $0.30 per million input tokens compared to Claude's $3.00.

For OpenClaw operators, MiniMax is especially interesting because the M2.5 variant is completely free on OpenRouter. That means you can test and run OpenClaw agent workflows without spending a single dollar on inference, then upgrade to M2.7 when you need the extra capability.


M2, M2.5, and M2.7: Understanding the Lineage

MiniMax has released three versions in the M2 family, each building on the same base architecture with incremental improvements:

| Version | Release | Key Improvement | Availability |
| --- | --- | --- | --- |
| M2 | Late 2025 | Original release: 230B MoE, 10B active, strong multilingual | HuggingFace, self-host |
| M2.5 | January 2026 | Improved instruction following, tool use, and safety alignment | OpenRouter (FREE), HuggingFace |
| M2.7 | March 2026 | Highest benchmarks (78% SWE-bench), improved reasoning chains | OpenRouter ($0.30/$1.20) |

The naming convention is straightforward: M2 is the base, and the decimal indicates the post-training iteration. M2.5 added better tool calling and instruction adherence. M2.7 pushed benchmark scores higher with improved reasoning and chain-of-thought capabilities.

For most OpenClaw operators, the decision is between M2.5 (free, good enough for testing and light workloads) and M2.7 (paid, best performance for production use).


Architecture and Specifications

| Specification | Value |
| --- | --- |
| Total Parameters | 230 billion |
| Active Parameters | ~10 billion per forward pass |
| Architecture | Mixture of Experts (MoE) |
| Inference Speed | ~100 tokens/second |
| Context Window | 128K tokens |
| Modalities | Text (M2.7 adds limited vision) |
| Tool Use | Yes (OAuth plugin support) |
| Developer | MiniMax (Shanghai) |

The 10B active parameter count is the key to MiniMax's cost efficiency. While the model has access to 230B parameters worth of learned knowledge through its expert routing system, each inference pass only computes through 10B — roughly the size of a small Llama model. This is why MiniMax can price M2.7 at $0.30 per million input tokens while maintaining strong performance: the compute cost per token is genuinely low.
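To make that concrete, here is a back-of-the-envelope calculation (plain Python, using the figures from the spec table above; the "~2 × params FLOPs per token" rule is a common rough estimate, not a MiniMax-published number):

```python
# Back-of-the-envelope: in a Mixture of Experts model, per-token compute
# scales with the ACTIVE parameter count, because each token is routed
# through only a subset of experts.
TOTAL_PARAMS_B = 230   # billions: full expert pool
ACTIVE_PARAMS_B = 10   # billions: computed per forward pass

active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B
print(f"Fraction of the network exercised per token: {active_fraction:.1%}")  # 4.3%

# A common rough estimate puts inference FLOPs at ~2 * params per token;
# for MoE, substitute the active count. Relative to a dense 230B model:
relative_compute = (2 * ACTIVE_PARAMS_B) / (2 * TOTAL_PARAMS_B)
print(f"Per-token compute vs. a dense 230B model: {relative_compute:.1%}")  # 4.3%
```

In other words, each token costs roughly what a 10B dense model would cost to run, while the router still draws on 230B parameters of learned knowledge.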

The ~100 tokens per second throughput is also notable. This is faster than many larger models (GLM-5 runs at ~69 tok/s, for comparison) and fast enough that users do not experience noticeable delays during real-time agent interactions.


Benchmarks and Performance

Here are M2.7's headline benchmark results and how they compare to the models OpenClaw operators typically use:

| Benchmark | M2.7 | Claude Sonnet 4 | GPT-4.1 |
| --- | --- | --- | --- |
| SWE-bench Verified | 78% | ~79% | ~78% |
| HumanEval | 89.5% | ~92% | ~90% |
| MMLU | 86.8% | ~90% | ~89% |
| MATH | 84.2% | ~86% | ~85% |

The pattern is consistent: M2.7 lands within 1-3 percentage points of Claude Sonnet and GPT-4.1 across every major benchmark. It is not quite as capable on the margins — the gap shows up most on complex, multi-step reasoning chains and creative tasks — but for the vast majority of practical agent workflows, the difference is imperceptible.

The 78% SWE-bench Verified score is the most relevant metric for OpenClaw operators running coding agents. This means M2.7 can autonomously resolve roughly four out of five real-world GitHub issues — reading the codebase, understanding the bug, and generating a working fix. That is production-grade reliability for most software engineering tasks.


Pricing: The 90/7 Equation

MiniMax's pricing is the strongest argument for using it with OpenClaw. Here is the full breakdown:

| Model | Input (per 1M tokens) | Output (per 1M tokens) | Monthly Cost (10M tokens/day, 50/50 input/output) |
| --- | --- | --- | --- |
| MiniMax M2.5 | FREE | FREE | $0 |
| MiniMax M2.7 | $0.30 | $1.20 | ~$225 |
| Claude Sonnet 4 | $3.00 | $15.00 | ~$2,700 |
| GPT-4.1 | $2.00 | $8.00 | ~$1,500 |

The "90% quality at 7% cost" claim comes from dividing M2.7's OpenRouter prices by Claude Sonnet's: $0.30 / $3.00 = 10% on input, and $1.20 / $15.00 = 8% on output. Blended, you are paying roughly 8-10% of what you would spend on Claude (close to the advertised 7%) for an agent that scores within one percentage point on SWE-bench.

For an OpenClaw operator running a busy coding agent — processing 10 million tokens per day across dozens of tasks — the difference between $225/month (M2.7) and $2,700/month (Claude Sonnet) is substantial. That is over $2,400/month in savings, or nearly $30,000/year.
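The monthly figures above can be reproduced with a few lines of Python. The 50/50 input/output split is an assumption that matches the table's numbers exactly; the split your agent actually sees will vary with workload:

```python
# Reproduce the monthly cost column: 10M tokens/day, 30-day month,
# assuming a 50/50 input/output split (real workloads will differ).

PRICES = {  # (input, output) in USD per 1M tokens
    "MiniMax M2.5": (0.00, 0.00),
    "MiniMax M2.7": (0.30, 1.20),
    "Claude Sonnet 4": (3.00, 15.00),
    "GPT-4.1": (2.00, 8.00),
}

def monthly_cost(input_price: float, output_price: float,
                 tokens_per_day: float = 10e6, days: int = 30,
                 input_share: float = 0.5) -> float:
    """Cost in USD for a month of traffic at the given per-1M-token prices."""
    total = tokens_per_day * days
    in_tokens = total * input_share
    out_tokens = total - in_tokens
    return (in_tokens * input_price + out_tokens * output_price) / 1e6

for model, (inp, out) in PRICES.items():
    print(f"{model}: ${monthly_cost(inp, out):,.0f}/month")
# MiniMax M2.7: $225/month, Claude Sonnet 4: $2,700/month, GPT-4.1: $1,500/month
```

Plug in your own daily token volume and input/output ratio to estimate your actual bill before committing to a model.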

And if you are just getting started or running a low-volume agent, M2.5 at zero cost means you can operate indefinitely without any inference spending.


Setup via OpenRouter

OpenRouter is the simplest way to connect MiniMax to OpenClaw. Both M2.5 (free) and M2.7 (paid) are available through the same API.

Step 1: Get an OpenRouter API Key

Sign up at openrouter.ai and generate an API key. For M2.5, you do not need to add credits — it is free. For M2.7, add at least $5 to start.
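Before touching your OpenClaw config, you can sanity-check the key directly against OpenRouter's OpenAI-compatible chat completions endpoint. This sketch uses only the standard library; the model ID is the one used in this guide's config examples, so confirm it against the current OpenRouter model list if the request 404s:

```python
# Quick sanity check that an OpenRouter API key works before wiring it
# into OpenClaw. Standard library only; no SDK required.
import json
import os
import urllib.request

API_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_request(api_key: str,
                  model: str = "minimax/m2.5-free") -> urllib.request.Request:
    """Assemble an OpenAI-style chat completion request for OpenRouter."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": "Reply with the word: ok"}],
        "max_tokens": 8,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

if __name__ == "__main__":
    req = build_request(os.environ["OPENROUTER_API_KEY"])
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    print(body["choices"][0]["message"]["content"])
```

If this prints a response, the key and model ID are good and any later failure is an OpenClaw configuration issue, not a credentials issue.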

Step 2: Configure OpenClaw for M2.5 (Free)

```yaml
# In your OpenClaw config (e.g., ~/.openclaw/config.yaml)
llm:
  provider: openrouter
  model: minimax/m2.5-free
  api_key: your-openrouter-api-key
  temperature: 0.7
  max_tokens: 8192
```

Step 3: Or Configure for M2.7 (Best Performance)

```yaml
# In your OpenClaw config (e.g., ~/.openclaw/config.yaml)
llm:
  provider: openrouter
  model: minimax/m2.7
  api_key: your-openrouter-api-key
  temperature: 0.7
  max_tokens: 8192
```

Step 4: Start OpenClaw

```shell
openclaw start
```

A practical workflow for new operators: start with M2.5 to build and test your agent workflows at zero cost. Once you have validated that the agent is working correctly, switch to M2.7 for better reasoning and benchmark performance. The configuration change is a single line.
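If you want to script that upgrade, the swap really is a literal string replacement. A minimal sketch, assuming the config path and model IDs from the examples in this guide (adjust both if your setup differs):

```python
# Script the M2.5 -> M2.7 upgrade: a one-line swap in the config file.
# Path and model IDs match this guide's examples; adjust if yours differ.
from pathlib import Path

def switch_model(config_path: Path,
                 old: str = "minimax/m2.5-free",
                 new: str = "minimax/m2.7") -> bool:
    """Replace the model ID in an OpenClaw config. Returns True if changed."""
    text = config_path.read_text()
    if old not in text:
        return False
    config_path.write_text(text.replace(old, new))
    return True

if __name__ == "__main__":
    cfg = Path.home() / ".openclaw" / "config.yaml"
    if cfg.exists():
        print("upgraded" if switch_model(cfg) else "no M2.5 model line found")
```

Run it, restart OpenClaw, and the agent is on M2.7.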


OAuth Plugin Configuration

MiniMax M2 models support OAuth-based plugin authentication, which enables OpenClaw to use external tools and APIs through the model's native tool-calling capabilities. This is useful for workflows that require web browsing, database queries, or third-party API calls.

Step 1: Register Your Plugin

In your OpenClaw plugin configuration, define the OAuth credentials for the service you want to connect:

```yaml
# In ~/.openclaw/plugins/my-tool.yaml
name: my-external-tool
auth:
  type: oauth2
  client_id: your-client-id
  client_secret: your-client-secret
  token_url: https://api.example.com/oauth/token
  scopes:
    - read
    - write
```

Step 2: Enable the Plugin in OpenClaw

```yaml
# In your OpenClaw config
plugins:
  enabled:
    - my-external-tool
  auto_authenticate: true
```

Step 3: Test Tool Calling

```shell
# Start OpenClaw and test a tool-calling workflow
openclaw start
# Then ask your agent to use the external tool
# MiniMax will automatically authenticate via OAuth
```

MiniMax's tool-calling reliability is strong — M2.7 correctly formats and executes tool calls in the vast majority of cases. For complex multi-tool workflows where the agent needs to chain several API calls together, M2.7 handles sequencing and error recovery well.


When MiniMax Beats Expensive Models

MiniMax M2 is the right choice for your OpenClaw agent in these specific scenarios:

  • Budget-constrained production agents: If you are running an agent that processes high token volumes — customer support, code review, document processing — M2.7's 7% cost ratio means you can scale 10x further on the same budget compared to Claude or GPT-4.
  • Zero-cost prototyping: M2.5 on OpenRouter is free. If you are building a new agent workflow and want to iterate without worrying about API costs, M2.5 lets you run hundreds of test cycles at zero cost before committing to a paid model.
  • Speed-sensitive workflows: At ~100 tokens per second, M2.7 is faster than most open models and competitive with proprietary APIs. For real-time agent interactions where users are waiting for responses, this throughput matters.
  • Coding and structured tasks: M2.7's 78% SWE-bench score means it handles code generation, bug fixing, and code review at near-frontier quality. If your agent primarily works with code and structured data rather than creative writing, M2.7 is a strong fit.
  • Gradual scaling: Start with M2.5 (free), validate your workflow, upgrade to M2.7 ($0.30/$1.20), and only move to Claude or GPT if you hit a quality ceiling. This progression minimizes waste and lets you find the minimum viable model for your specific use case.

For a broader comparison of free model options, see Free API Models for OpenClaw.


Limitations

MiniMax M2 has real limitations that you should evaluate before making it your primary agent backend:

  • Complex reasoning gap: While M2.7 scores within 1-3% of Claude on benchmarks, the gap widens on tasks that require deep multi-step reasoning, abstract thinking, or understanding subtle context. For agent tasks that involve complex decision-making chains, Claude and GPT-4 are still meaningfully better.
  • Creative writing: M2 models produce functional English prose but lack the natural fluency of Claude for creative, editorial, or user-facing content. If your agent generates customer-facing text, you may notice a quality difference.
  • Vision is limited: M2.7 has basic vision capabilities, but they are not as robust as Claude's or GPT-4's image understanding. For agents that need to process screenshots, diagrams, or visual documents, a multimodal-first model is the better choice.
  • Free tier rate limits: M2.5's free OpenRouter tier has rate limits that can throttle high-volume usage. For production agents handling more than a few hundred requests per day, you will need to upgrade to M2.7 or face intermittent delays.
  • Smaller ecosystem: MiniMax has fewer community resources, tutorials, and third-party integrations compared to Llama, Gemma, or Qwen. If you run into issues, you may have fewer places to find answers.
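If you do stay on the free M2.5 tier, a jittered exponential-backoff wrapper smooths over intermittent rate-limit errors. A minimal, provider-agnostic sketch; `RateLimitError` is a stand-in you would map to your HTTP client's 429 handling:

```python
# Jittered exponential backoff for free-tier rate limits (HTTP 429).
import random
import time

class RateLimitError(Exception):
    """Stand-in for a provider 429 response."""

def with_backoff(call, max_retries: int = 5, base_delay: float = 1.0):
    """Call `call()`, retrying on rate limits with ~1s, 2s, 4s, ... delays plus jitter."""
    for attempt in range(max_retries):
        try:
            return call()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the error to the caller
            # Jitter spreads out retries so parallel agents don't re-collide.
            time.sleep(base_delay * (2 ** attempt + random.random()))
    raise AssertionError("unreachable")
```

Usage: wrap each model invocation, e.g. `with_backoff(lambda: client.complete(prompt))`. This turns free-tier throttling into added latency rather than failed tasks, though sustained high volume still warrants the M2.7 upgrade.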

For strategies to minimize API costs across all models, see Cheapest Way to Run OpenClaw.


Frequently Asked Questions

Is MiniMax M2.5 really free on OpenRouter?

Yes. MiniMax M2.5 is available at zero cost on OpenRouter with no API key billing. It is rate-limited and slightly less capable than M2.7, but for testing workflows or low-volume agent tasks, it works well. You can start with M2.5 for free and upgrade to M2.7 when you need the extra performance.

How does MiniMax M2.7 compare to Claude Sonnet?

M2.7 scores 78% on SWE-bench Verified — roughly on par with Claude Sonnet 4. Where it differs is cost: M2.7 runs at $0.30 per million input tokens on OpenRouter versus $3.00 for Claude Sonnet. That means you get approximately 90% of the quality at about 7% of the cost. For coding and structured tasks, M2.7 is a strong alternative. For nuanced reasoning and creative writing, Claude still leads.

Can I use MiniMax with OpenClaw's OAuth plugin system?

Yes. MiniMax supports OAuth-based plugin authentication, which means you can connect it to OpenClaw's plugin system for tool use, web browsing, and external API calls. Configure the OAuth credentials in your OpenClaw plugin settings and MiniMax will authenticate automatically when invoking tools.

What is the difference between M2, M2.5, and M2.7?

M2 was the original release — a 230B total parameter MoE model with 10B active. M2.5 was a post-training improvement with better instruction following and tool use, offered free on OpenRouter. M2.7 is the latest version with the highest benchmark scores (78% SWE-bench), improved reasoning, and paid pricing at $0.30/$1.20 per million tokens. Each version builds on the same base architecture with incremental improvements.


Further Reading