MiniMax M2
Overview
MiniMax M2 is an open-source large language model from MiniMax, built for agentic and coding tasks. It delivers state-of-the-art tool use, reasoning, and search performance with exceptional cost-efficiency and speed: roughly 8% of Claude 3.5 Sonnet's price and nearly double its inference speed (≈100 tokens/s). Designed for end-to-end agentic workflows, it excels at long-chain tool calling across Shell, Browser, Python, and other MCP tools. While it trails the top overseas models slightly in programming, it ranks among the strongest Chinese models and in the global top five on the Artificial Analysis benchmark. M2 powers the MiniMax Agent platform, available in Lightning Mode for fast tasks and Pro Mode for complex multi-step reasoning; its weights, API, and deployment guides are freely available via Hugging Face, vLLM, and SGLang.
MiniMax M2 was released on October 27, 2025. API access is available through MiniMax and Novita.
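As a minimal sketch of what an API call might look like, assuming the provider exposes an OpenAI-style chat-completions endpoint (the endpoint URL and model identifier below are illustrative placeholders, not confirmed values; check the provider's documentation):

```python
import json

# Hypothetical endpoint -- replace with the actual URL from the provider's docs.
BASE_URL = "https://api.example-provider.com/v1/chat/completions"

def build_chat_request(prompt: str, model: str = "MiniMax-M2",
                       max_tokens: int = 1024) -> dict:
    """Build an OpenAI-style chat-completions payload (structure assumed)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

payload = build_chat_request("Write a shell one-liner to count lines in *.py files.")
print(json.dumps(payload, indent=2))
```

The payload would then be POSTed to the endpoint with an API key in the `Authorization` header.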
Benchmarks
MiniMax M2 performance across datasets: scores are sourced from the model's scorecard, paper, or official blog posts.
Pricing
Pricing, performance, and capabilities for MiniMax M2 across different providers:
| Provider | Input ($/M) | Output ($/M) | Max Input | Max Output | Latency (s) | Throughput (tokens/s) | Quantization | Input Modalities | Output Modalities |
|---|---|---|---|---|---|---|---|---|---|
| MiniMax | $0.30 | $1.20 | 1.0M | 1.0M | 4.0 | 70.0 | — | Text, Image, Audio, Video | Text, Image, Audio, Video |
| Novita | $0.30 | $1.20 | 204.8K | 131.1K | — | — | bf16 | Text, Image, Audio, Video | Text, Image, Audio, Video |
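The per-million-token rates above translate directly into per-request costs. A minimal sketch, with the $0.30 / $1.20 prices from the table hard-coded as defaults:

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float = 0.30,
                 output_price_per_m: float = 1.20) -> float:
    """Cost in USD for one request at the table's per-million-token rates."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000

# e.g. a 50K-token prompt producing a 2K-token answer:
cost = request_cost(50_000, 2_000)
print(f"${cost:.4f}")  # → $0.0174
```

Because output tokens cost 4x input tokens here, long agentic generations dominate the bill even when prompts are large.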
Provider comparisons for MiniMax M2 (charts): price per 1M input tokens (USD, lower is better), throughput (tokens per second, higher is better), and time to first token (seconds, lower is better).
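Time to first token and throughput combine into a rough wall-clock estimate for a response: total time ≈ TTFT + output_tokens / throughput. A sketch using the MiniMax row's figures (4.0 s TTFT, 70 tokens/s), which ignores network variance and assumes steady-state decode speed:

```python
def generation_time(output_tokens: int, ttft_s: float = 4.0,
                    throughput_tps: float = 70.0) -> float:
    """Rough wall-clock seconds: time to first token plus steady-state decoding."""
    return ttft_s + output_tokens / throughput_tps

print(f"{generation_time(700):.1f} s")  # 4.0 + 700/70 → 14.0 s
```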
