MiMo-V2-Flash
Overview
MiMo-V2-Flash is a powerful, efficient, and ultra-fast foundation language model that excels in reasoning, coding, and agentic scenarios. It is a Mixture-of-Experts model with 309B total parameters and 15B active parameters, featuring a hybrid attention architecture that mixes sliding-window and full attention (5:1 ratio, 128-token window). It delivers around 150 tokens/sec inference and supports a 256K context window.
MiMo-V2-Flash was released on December 16, 2025. API access is available through Xiaomi.
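The 5:1 hybrid attention layout described above can be sketched in a few lines. This is an illustrative assumption, not the published implementation: the exact layer ordering and mask convention of MiMo-V2-Flash are not stated here, so the sketch simply interleaves five sliding-window layers per full-attention layer and limits windowed queries to the previous 128 tokens.

```python
# Sketch of a 5:1 hybrid attention layout (assumption for illustration;
# the real MiMo-V2-Flash layer ordering may differ).

WINDOW = 128   # sliding-window size from the overview
RATIO = 5      # sliding-window : full attention layers

def attention_kind(layer_idx: int) -> str:
    """Every sixth layer is full attention; the rest use a sliding window."""
    return "full" if layer_idx % (RATIO + 1) == RATIO else "sliding"

def visible_positions(query_pos: int, kind: str) -> range:
    """Causal positions a query token may attend to under each scheme."""
    if kind == "full":
        return range(0, query_pos + 1)
    return range(max(0, query_pos + 1 - WINDOW), query_pos + 1)
```

Under this layout, a token deep in the context still reaches all earlier tokens through the periodic full-attention layers, while most layers pay only for a 128-token window.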
Performance
Timeline
Other Details
Related Models
Compare MiMo-V2-Flash to other models by quality (GPQA score) vs cost. Higher scores and lower costs represent better value.
Benchmarks
MiMo-V2-Flash Performance Across Datasets
Scores sourced from the model's scorecard, paper, or official blog posts
Pricing
Pricing, performance, and capabilities for MiMo-V2-Flash across different providers:
| Provider | Input ($/M) | Output ($/M) | Max Input | Max Output | Latency (s) | Throughput | Quantization | Input Modalities | Output Modalities |
|---|---|---|---|---|---|---|---|---|---|
| Xiaomi | $0.10 | $0.30 | 256.0K | 16.4K | — | — | — | Text, Image, Audio, Video | Text, Image, Audio, Video |
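The per-million-token rates in the table translate directly into a per-request cost. A minimal sketch, using the listed Xiaomi prices ($0.10/M input, $0.30/M output); the token counts in the usage example are hypothetical:

```python
# Estimate request cost from the per-million-token rates listed above.
INPUT_PRICE_PER_M = 0.10    # $ per million input tokens (Xiaomi row)
OUTPUT_PRICE_PER_M = 0.30   # $ per million output tokens (Xiaomi row)

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in dollars for one request at the listed rates."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# Example: a 100K-token prompt with a 10K-token completion costs $0.013.
cost = request_cost(100_000, 10_000)
```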
Example Outputs
Recent Posts
Recent Reviews
API Access
API access for MiMo-V2-Flash will be available soon through our gateway.
FAQ
Common questions about MiMo-V2-Flash
