
Pixtral-12B

Overview

Pixtral-12B is a 12B-parameter multimodal model with a 400M-parameter vision encoder, capable of understanding both natural images and documents. It excels at multimodal tasks while maintaining strong text-only performance, and it supports variable image sizes and multiple images in context.

Pixtral-12B was released on September 17, 2024. API access is available through Mistral AI.
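The snippet below is a minimal sketch of calling Pixtral-12B through Mistral AI's chat completions API with a text prompt and an image URL. The endpoint URL, the pixtral-12b-2409 model identifier, the image_url content format, and the MISTRAL_API_KEY environment variable are assumptions based on Mistral AI's documented API; verify the current request format against the official docs before relying on it.

```python
import os
import requests

# Hypothetical example request: endpoint, model id, and payload shape are
# assumptions based on Mistral AI's chat completions API, not confirmed here.
API_URL = "https://api.mistral.ai/v1/chat/completions"
API_KEY = os.environ["MISTRAL_API_KEY"]

payload = {
    "model": "pixtral-12b-2409",
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this chart in two sentences."},
                # Pixtral-12B accepts image inputs alongside text; multiple
                # images can be included in a single message's content list.
                {"type": "image_url", "image_url": "https://example.com/chart.png"},
            ],
        }
    ],
    "max_tokens": 256,
}

response = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json=payload,
    timeout=60,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```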

Performance

Timeline

Released: September 17, 2024
Knowledge Cutoff: Unknown

Specifications

Parameters: 12.4B
License: Apache 2.0
Training Data: Unknown

Benchmarks

Pixtral-12B Performance Across Datasets

Scores sourced from the model's scorecard, paper, or official blog posts


Pricing

Pricing, performance, and capabilities for Pixtral-12B across different providers:

Provider: Mistral AI
Input price: $0.15 / 1M tokens
Output price: $0.15 / 1M tokens
Max Input: 128.0K tokens
Max Output: 8.2K tokens
Latency: 0.5 s
Throughput: 0.1 c/s
Quantization: Unknown
Input modalities: Text, Image
Output modalities: Text
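As a worked example of the per-million-token pricing above, the following sketch estimates the cost of a single request at Mistral AI's listed rates; the token counts are hypothetical.

```python
# Hypothetical request: 10,000 input tokens (prompt plus image tokens) and
# 500 output tokens, priced at $0.15 per 1M tokens for both input and output.
INPUT_PRICE_PER_M = 0.15   # USD per 1M input tokens
OUTPUT_PRICE_PER_M = 0.15  # USD per 1M output tokens

input_tokens = 10_000
output_tokens = 500

cost = (input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000
print(f"Estimated cost: ${cost:.6f}")  # -> Estimated cost: $0.001575
```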

API Access

API Access Coming Soon

API access for Pixtral-12B will be available soon through our gateway.


FAQ

Common questions about Pixtral-12B

Q: When was Pixtral-12B released?
A: Pixtral-12B was released on September 17, 2024 by Mistral AI.

Q: Who created Pixtral-12B?
A: Pixtral-12B was created by Mistral AI.

Q: How many parameters does Pixtral-12B have?
A: Pixtral-12B has 12.4 billion parameters.

Q: What license is Pixtral-12B released under?
A: Pixtral-12B is released under the Apache 2.0 license, an open-source/open-weight license.

Q: Is Pixtral-12B multimodal?
A: Yes, Pixtral-12B is a multimodal model that can process both text and images as input.