Xiaomi, released on Apr 22, 2026

MiMo-V2.5: Benchmarks, Pricing & Context Window

MiMo-V2.5 is a language model from Xiaomi, released in April 2026, with multimodal input.

MiMo-V2.5 is Xiaomi's native omnimodal sparse Mixture-of-Experts model with 310B total parameters, 15B activated parameters, and a 1M-token context window. Built on the MiMo-V2-Flash backbone, it adds dedicated vision and audio encoders.

Input: Text, Image, Audio, Video
Output: Text

MiMo-V2.5 pricing

Providers

MiMo-V2.5 starts at $0.168 per million input tokens and $0.336 per million output tokens via Novita. Both providers are listed below with their per-token pricing, latency, throughput, and modality support.

Provider   Input $/M  Output $/M  Max Input  Max Output  Latency s  Throughput  Quant  Input  Output
Novita     $0.168     $0.336      1.0M       131.1K      -          -           -      -      -
DeepInfra  $0.400     $2.00       262.1K     131.1K      -          -           fp8    -      -
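
As a rough illustration of how the per-million-token prices above translate into a per-request cost, the following Python sketch estimates the cost for each provider and picks the cheapest one whose limits fit the request. The `Provider` class and the function names are hypothetical helpers written for this example, not part of any provider SDK. The token limits are the rounded figures from the table (1.0M, 262.1K, 131.1K), so the exact limits are an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Provider:
    """Hypothetical record holding one row of the provider table."""
    name: str
    input_usd_per_m: float    # USD per million input tokens
    output_usd_per_m: float   # USD per million output tokens
    max_input_tokens: int     # rounded table value, exact limit is an assumption
    max_output_tokens: int    # rounded table value, exact limit is an assumption


# Prices copied from the table; limits approximated from "1.0M", "262.1K", "131.1K".
PROVIDERS = [
    Provider("Novita", 0.168, 0.336, 1_000_000, 131_100),
    Provider("DeepInfra", 0.400, 2.00, 262_100, 131_100),
]


def estimate_cost(provider: Provider, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated request cost in USD."""
    total = (input_tokens * provider.input_usd_per_m
             + output_tokens * provider.output_usd_per_m)
    return total / 1_000_000


def cheapest_provider(input_tokens: int, output_tokens: int) -> Optional[Provider]:
    """Return the cheapest provider whose limits fit the request, or None."""
    candidates = [
        p for p in PROVIDERS
        if input_tokens <= p.max_input_tokens and output_tokens <= p.max_output_tokens
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda p: estimate_cost(p, input_tokens, output_tokens))


if __name__ == "__main__":
    for p in PROVIDERS:
        print(f"{p.name}: ${estimate_cost(p, 200_000, 4_000):.6f}")
```

For a request with 200,000 input tokens and 4,000 output tokens, this works out to about $0.035 via Novita (0.0336 + 0.001344) and $0.088 via DeepInfra (0.08 + 0.008). A request with more than roughly 262K input tokens only fits the Novita listing under the limits assumed here.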

MiMo-V2.5 API

API access coming soon

MiMo-V2.5 will be available through our gateway shortly.

MiMo-V2.5 examples

Recent arena outputs from MiMo-V2.5, picked from the highest-ranked matchups.

MiMo-V2.5 license

MiMo-V2.5 is released under the MIT license, which permits commercial use. The model has 310.8B parameters.

License: MIT (commercial use allowed)
Parameters: 310.8B


FAQ

Common questions about MiMo-V2.5.

What is the MiMo-V2.5 release date?

MiMo-V2.5 was released on April 22, 2026 by Xiaomi.

Who created MiMo-V2.5?

MiMo-V2.5 was created by Xiaomi.

How many parameters does MiMo-V2.5 have?

MiMo-V2.5 has 310.8 billion total parameters, of which 15 billion are activated.

What is the license for MiMo-V2.5?

MiMo-V2.5 is released under the MIT license, a permissive open-source license that allows commercial use.

Is MiMo-V2.5 multimodal?

Yes, MiMo-V2.5 is a multimodal model that accepts text, image, audio, and video as input and produces text as output.