The MiniMax-M2 Series: Mini Activations Unleashing Max Real-World Intelligence

📅 2026-05-25
📈 Citations: 0 (Influential: 0)
🤖 AI Summary
This work asks how a language model can reach state-of-the-art performance on real-world tasks while activating only a small fraction of its parameters. The authors pair a sparsely activated Mixture-of-Experts architecture with agent-driven data generation, an agent-native reinforcement learning system named Forge, and an early self-evolution mechanism. Forge decouples training, inference, and the agent, and relies on windowed-FIFO scheduling, prefix-tree merging, and inference optimization. The flagship model activates 9.8 billion of its 229.9 billion total parameters per token (about 4.3%) and reports frontier-tier results on agentic coding, deep search, office-task, and reasoning benchmarks. The latest checkpoint, M2.7, also shows initial capabilities in autonomously debugging training runs and modifying its own scaffold.
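The parameter counts quoted above imply a specific activation ratio, which can be checked with a few lines of arithmetic. The sketch below uses only the two figures stated in the summary; the function and variable names are illustrative and do not come from the paper.

```python
# Parameter counts as stated in the summary and abstract, in billions.
TOTAL_PARAMS_B = 229.9
ACTIVE_PARAMS_B = 9.8


def activation_fraction(active_b: float, total_b: float) -> float:
    """Return the share of parameters activated per token."""
    if total_b <= 0:
        raise ValueError("total parameter count must be positive")
    return active_b / total_b


fraction = activation_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
sparsity_ratio = TOTAL_PARAMS_B / ACTIVE_PARAMS_B  # total parameters per active one

print(f"activated per token: {fraction:.2%}")
print(f"total-to-active ratio: {sparsity_ratio:.1f}x")
```

In other words, roughly one parameter in every 23 takes part in processing a given token.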
📝 Abstract
We introduce the MiniMax-M2 series, a family of Mixture-of-Experts language models built around the principle that mini activations can unleash maximum real-world intelligence. The flagship M2 contains 229.9B total parameters with only 9.8B activated per token. Designed end-to-end for agentic deployment, the M2 series rests on three components: (i) agent-driven data pipelines producing large-scale, verifiable trajectories across agentic coding and agentic cowork, each grounded in an executable workspace and an artifact-aligned reward; (ii) Forge, a scalable agent-native RL system that adapts to long-horizon agent trajectories, paired with windowed-FIFO scheduling, prefix-tree merging, inference optimization, and a clean training-inference-agent decoupling that supports both white-box and black-box agents; (iii) the latest M2.7 checkpoint takes an early step toward self-evolution -- autonomously debugging training runs and modifying its own scaffold. Across M2 through M2.7, this combination translates a mini-activation footprint into frontier-tier performance on agentic coding, deep search, office-task, and reasoning benchmarks.
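The abstract names prefix-tree merging without describing it. As a rough illustration of the general idea only, the sketch below merges token sequences that share a common prefix into a trie so that shared tokens are stored once. This is a generic technique and an assumption about what the term means here, not Forge's actual implementation; `TrieNode`, `build_prefix_tree`, and the toy token IDs are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class TrieNode:
    # Maps a token ID to the child node reached by that token.
    children: dict = field(default_factory=dict)
    # Number of input sequences that pass through this node.
    count: int = 0


def build_prefix_tree(sequences):
    """Merge token sequences into one trie; shared prefixes share nodes."""
    root = TrieNode()
    for seq in sequences:
        node = root
        for token in seq:
            node = node.children.setdefault(token, TrieNode())
            node.count += 1
    return root


def count_nodes(node):
    """Count the token nodes below `node` (the root itself holds no token)."""
    return sum(1 + count_nodes(child) for child in node.children.values())


# Toy trajectories (hypothetical token IDs) that share the prefix [1, 2].
sequences = [[1, 2, 3, 4], [1, 2, 3, 5], [1, 2, 6]]
root = build_prefix_tree(sequences)

total_tokens = sum(len(seq) for seq in sequences)  # tokens if stored separately
unique_tokens = count_nodes(root)                  # tokens after merging prefixes

print(f"{total_tokens} tokens unmerged, {unique_tokens} after merging")
```

In this toy case the three trajectories hold 11 tokens in total but only 6 distinct trie nodes. Whether Forge realizes its savings this way is not stated in the abstract.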
Problem

Research questions and friction points this paper is trying to address.

Mixture-of-Experts
mini activations
agentic intelligence
real-world performance
parameter efficiency
Innovation

Methods, ideas, or system contributions that make the work stand out.

mini activations
Mixture-of-Experts
agentic deployment
agent-native RL
self-evolution
Aili Chen
Fudan University
Large Language Model, Reasoning and Planning, Language Agent, LLM Personalization
Aonian Li
Baichuan Zhou
University of Waterloo
Computer Vision, Natural Language Processing
Bangwei Gong
Binyang Jiang
Boji Dan
Changqing Yu
Chao Wang
Cheng Ma
Cheng Zhong
Cheng Zhu
J. Erskine Love Jr. Endowed Chair in Engineering and Regents' Professor
Biomechanics, Mechanobiology, Immunology, Cancer, Hemostasis and Thrombosis
Chengjun Xiao
Chengyi Yang
Chengyu Du
Fudan University
LLM, Agent, RL
Chenyang Zhang
Chi Zhang
Chuangyi Huang
Chunhao Zhang
Chunhui Du
Chunyu Zhao
Congchao Guo
Da Chen
Deming Ding