MiniMax Releases M2.5 AI Models Promising Near Frontier Performance at a Fraction of the Cost

Key Takeaway

MiniMax has launched the M2.5 and M2.5-Lightning language models, claiming near state-of-the-art performance while cutting costs by up to 95% compared with leading models such as Claude Opus 4.6. New details also reinforce that the models are aimed at coding, search, tool use, and automation-heavy agent workflows.

MiniMax Releases M2.5 AI Models Promising Near Frontier Performance at a Fraction of the Cost (Credit - Midjourney, The AI Track)

MiniMax M2.5 and M2.5-Lightning models – Key Points

The Story

Shanghai-based AI startup MiniMax has introduced two new language models, M2.5 and M2.5-Lightning, designed to deliver high-end AI performance at dramatically lower cost. The company says the models approach the performance of top systems from Google and Anthropic while enabling large-scale agentic workflows for enterprise tasks. M2.5 is MiniMax’s third M2-series release in 108 days, and external benchmarking from OpenHands reinforces the case that the model is notable not just for raw capability but for how cheaply it can run long software-engineering workloads. Recent demonstrations also position M2.5 as a practical model for producing office documents, research tasks, presentations, and landing pages through MiniMax’s own agent product.

The Facts

  • Two model variants released

    MiniMax launched M2.5 and M2.5-Lightning, both available through API access and designed for large-scale production workloads. MiniMax says the two versions have the same core capability, with the main differences being speed and price.

  • MiniMax is iterating quickly

    M2.5 is the company’s third M2-series iteration in 108 days, following M2 and M2.1.

  • Near frontier-level performance claims

    The company states the models approach the performance of top-tier models such as Claude Opus 4.6, placing them in the current top tier of coding-oriented AI systems. The update also says MiniMax benchmarked M2.5 as competitive with Gemini 3 Pro and GPT-5.2, though not against GPT-5.3.

  • The model is positioned for coding and agentic work

    The new material describes M2.5 as built for high-throughput, low-latency production environments, especially tasks involving coding, automation, search, tool use, and multi-step office workflows.

  • Mixture-of-Experts architecture reduces compute cost

    M2.5 uses a Mixture-of-Experts (MoE) design with 230 billion parameters, but only 10 billion parameters activate per token, allowing high reasoning capacity with lower computational overhead.

  • Reinforcement learning framework called Forge

    MiniMax trained the model using a proprietary RL framework called Forge, which exposes the model to simulated work environments where it practices coding and tool use. The new material says MiniMax has built hundreds of thousands of training environments from tasks and workspaces used inside the company.

  • Two months of training reported

    MiniMax engineer Olive Song stated that the system was trained over approximately two months, focusing heavily on reinforcement learning across diverse environments.

  • CISPO method stabilizes training

    The training pipeline incorporates Clipping Importance Sampling Policy Optimization (CISPO), which clips importance-sampling weights to keep policy updates stable during reinforcement learning.

  • Training system reportedly speeds up RL

    MiniMax uses asynchronous scheduling and a tree-structured merging strategy to balance fresh and older experiences during training. According to the update, this delivers a claimed 40× training speedup over a simpler generate-then-train loop.

  • Architect-style reasoning approach

    According to MiniMax, M2.5 tends to plan project structures and features before writing code, which the company describes as an “Architect Mindset.” The new demonstrations also describe strong planning behavior when building structured outputs such as presentations and landing pages.

  • Internal deployment inside MiniMax

    The company reports that 30% of internal tasks are already handled by M2.5, and 80% of newly committed code at the company is generated by the model.

  • Strong benchmark results reported by MiniMax

    • SWE-Bench Verified: 80.2%
    • BrowseComp: 76.3%
    • Multi-SWE-Bench: 51.3%
    • BFCL tool-calling benchmark: 76.8%
  • Two performance tiers available

    • M2.5-Lightning: ~100 tokens per second
    • M2.5 standard: ~50 tokens per second
  • API pricing designed for high-volume usage

    • M2.5-Lightning: $0.30 per 1M input tokens / $2.40 per 1M output tokens
    • M2.5 standard: $0.15 per 1M input tokens / $1.20 per 1M output tokens
  • Major cost difference compared with leading models

    MiniMax claims typical tasks cost about $0.15 versus roughly $3.00 for Claude Opus 4.6, suggesting a potential 10–20× price difference versus competing proprietary models. The update also describes MiniMax’s marketing pitch as roughly $1 per hour to run the faster model continuously at about 100 tokens per second, versus roughly $15–$20 per hour for Claude Opus.
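
The Mixture-of-Experts figures reported above translate into a simple compute argument. As a back-of-the-envelope sketch (using only the parameter counts MiniMax reports; actual per-token FLOPs depend on architecture details the company has not published):

```python
# Rough sketch: in a Mixture-of-Experts model, per-token compute scales with
# the *active* parameters, not the total stored parameters. The counts below
# are the figures MiniMax reports for M2.5.
total_params = 230e9    # 230B parameters stored
active_params = 10e9    # ~10B parameters activated per token

active_fraction = active_params / total_params          # ~0.043
dense_equivalent_saving = total_params / active_params  # ~23

print(f"Active fraction per token: {active_fraction:.1%}")           # 4.3%
print(f"Compute vs. a dense 230B model: ~{dense_equivalent_saving:.0f}x less")
```

This is the core economic lever behind MoE designs: the model keeps the capacity of 230B parameters while paying roughly the inference cost of a 10B dense model per token.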
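
The CISPO technique mentioned above can be sketched in a few lines. This is a hedged illustration based on the method's name and public descriptions (capping the importance-sampling weight itself, so every token keeps contributing a bounded gradient), not MiniMax's actual training code; `eps_high` is an illustrative hyperparameter, not a published value.

```python
import math

def cispo_weight(logp_new: float, logp_old: float, eps_high: float = 2.0) -> float:
    """Truncated importance-sampling weight r = pi_new(a|s) / pi_old(a|s).

    Unlike PPO-style surrogate clipping, which can zero out the gradient for
    clipped tokens entirely, clipping the weight itself keeps every token
    contributing a (bounded) gradient to the policy update.
    """
    ratio = math.exp(logp_new - logp_old)
    return min(ratio, eps_high)  # one-sided upper clip against runaway updates

# Unchanged policy -> weight 1; a large likelihood jump is capped at eps_high.
assert cispo_weight(0.0, 0.0) == 1.0
assert cispo_weight(5.0, 0.0) == 2.0
```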
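
The "$1 per hour" pitch above is easy to sanity-check against the listed Lightning rates. This rough check counts output-token cost only and ignores input tokens and other overhead:

```python
# Sanity check of the ~$1/hour claim for M2.5-Lightning, using the listed
# rate of $2.40 per 1M output tokens at a sustained ~100 tokens/second.
tokens_per_second = 100
price_per_million_output = 2.40  # USD, output side only

tokens_per_hour = tokens_per_second * 3600                       # 360,000
cost_per_hour = tokens_per_hour / 1_000_000 * price_per_million_output

print(f"~${cost_per_hour:.2f}/hour in output tokens")  # ~$0.86/hour
```

The result, roughly $0.86 per hour before input-token costs, is consistent with the company's ~$1/hour framing.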

Benchmarks / Evidence Check

MiniMax reports strong results on several industry benchmarks:

  • SWE-Bench Verified: 80.2% (reported by MiniMax)
  • BrowseComp: 76.3% (reported by MiniMax)
  • Multi-SWE-Bench: 51.3% (reported by MiniMax)
  • BFCL tool-calling benchmark: 76.8% (reported by MiniMax)

Numbers that Matter

  • 230B parameters total model size
  • 10B active parameters per token (MoE activation)
  • Hundreds of thousands of RL environments used in training, according to the update
  • 40× training speedup claimed for Forge training orchestration
  • $1 per hour claimed continuous runtime for M2.5-Lightning at about 100 tokens per second
  • 30% of internal tasks automated at MiniMax
  • 80% of new code commits generated by the model

Risks / Limitations

  • Open-source status remains unclear

    MiniMax described the model as open source, but weights, code, and license terms have not yet been released. The update says the model is not open weights yet, even if some serving partners already appear to have access.

  • Benchmark claims are partly company-reported

    Performance results cited originate mainly from MiniMax benchmarks rather than independent third-party testing, though the OpenHands evaluation adds an external data point for coding-agent use cases.

Why This Matters

MiniMax’s release signals a potential shift in the economics of AI development. If high-performance models become dramatically cheaper to run, developers may move beyond simple chatbot applications and deploy long-running AI agents capable of coding, researching, generating office documents, building presentations, and managing workflows continuously without prohibitive costs.


This article was drafted with the assistance of generative AI. All facts and details were reviewed and confirmed by an editor prior to publication.

