Xiaomi, released on Mar 18, 2026

MiMo-V2-Omni: Benchmarks, Pricing & Context Window

MiMo-V2-Omni is a language model from Xiaomi, released in March 2026, with multimodal input.

MiMo-V2-Omni is Xiaomi's omni foundation model uniting frontier multimodal understanding with strong agentic capability. It fuses dedicated image, video, and audio encoders into a single shared backbone, processing all modalities

Input: Text, Image, Audio, Video
Output: Text

MiMo-V2-Omni pricing

Providers

MiMo-V2-Omni starts at $0.400 per million input tokens and $2.00 per million output tokens via Xiaomi.

Provider    Input $/M    Output $/M    Max Input    Max Output
Xiaomi      $0.400       $2.00         262.0K       16.4K

No latency p95, throughput p95, or quantization figures are listed for this provider.
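As a rough illustration of how the listed rates turn into a per-request cost, here is a minimal sketch. The helper names (estimate_cost_usd, fits_limits) are hypothetical and not part of any Xiaomi SDK, and the token limits are the rounded figures from the table (262.0K and 16.4K), so the exact limits may differ slightly.

```python
from decimal import Decimal

# Rates listed for the Xiaomi provider, in USD per million tokens.
INPUT_USD_PER_M = Decimal("0.400")
OUTPUT_USD_PER_M = Decimal("2.00")

# Assumption: rounded limits as shown in the listing ("262.0K", "16.4K").
# The exact token limits are not given in the source.
MAX_INPUT_TOKENS = 262_000
MAX_OUTPUT_TOKENS = 16_400


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost of one request in USD from its token counts."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    million = Decimal(1_000_000)
    input_cost = Decimal(input_tokens) * INPUT_USD_PER_M / million
    output_cost = Decimal(output_tokens) * OUTPUT_USD_PER_M / million
    return input_cost + output_cost


def fits_limits(input_tokens: int, output_tokens: int) -> bool:
    """Check a request against the (rounded) listed input and output limits."""
    return input_tokens <= MAX_INPUT_TOKENS and output_tokens <= MAX_OUTPUT_TOKENS


if __name__ == "__main__":
    # Example: a 100,000-token prompt with a 4,000-token reply.
    print(estimate_cost_usd(100_000, 4_000), "USD")
    print(fits_limits(100_000, 4_000))
```

Decimal is used instead of float so that small per-token amounts add up without binary rounding noise. How image, audio, and video inputs are converted to tokens is not stated in the listing, so this sketch only covers requests whose token counts are already known.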

MiMo-V2-Omni API

API access coming soon

MiMo-V2-Omni will be available through our gateway shortly. In the meantime, it is served directly by Xiaomi.

MiMo-V2-Omni license

MiMo-V2-Omni is released under a proprietary license that restricts commercial use.

License: Proprietary (non-commercial). Usage restrictions apply.

FAQ

Common questions about MiMo-V2-Omni.

When was MiMo-V2-Omni released?

MiMo-V2-Omni was released on March 18, 2026 by Xiaomi. This is the official MiMo-V2-Omni release date tracked on LLM Stats.

Is MiMo-V2-Omni available via API?

Yes, MiMo-V2-Omni is available via API through Xiaomi, the one provider tracked on LLM Stats. See the official documentation for authentication and endpoint details.

Who created MiMo-V2-Omni?

MiMo-V2-Omni was created by Xiaomi.

What is the license for MiMo-V2-Omni?

MiMo-V2-Omni is released under a proprietary license that restricts commercial use.

Is MiMo-V2-Omni multimodal?

Yes, MiMo-V2-Omni is multimodal. It accepts text, image, audio, and video input and produces text output.

Where can I use MiMo-V2-Omni?

MiMo-V2-Omni is available through one provider, Xiaomi.

Where is the MiMo-V2-Omni paper or technical report?

The MiMo-V2-Omni paper or technical report is available at https://mimo.xiaomi.com/mimo-v2-omni. Use that source for architecture, training, release, and evaluation details.