Pixtral-12B
Overview
A 12B parameter multimodal model with a 400M parameter vision encoder, capable of understanding both natural images and documents. Excels at multimodal tasks while maintaining strong text-only performance. Supports variable image sizes and multiple images in context.
Pixtral-12B was released on September 17, 2024. API access is available through Mistral AI.
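Since the model accepts text and multiple images in one context, a request to it is assembled as a list of mixed content parts. A minimal sketch of such a payload, assuming the shape of Mistral's chat-completions API and the model identifier `pixtral-12b-2409` (verify both against Mistral's current documentation):

```python
# Sketch: building a multimodal chat request for Pixtral-12B.
# The model name ("pixtral-12b-2409") and payload shape are assumptions
# based on Mistral's published chat-completions API; check current docs.
import json


def build_pixtral_request(prompt: str, image_urls: list[str]) -> dict:
    """Assemble a chat-completions payload mixing text and image parts."""
    content = [{"type": "text", "text": prompt}]
    # Pixtral supports multiple images in a single context window.
    content += [{"type": "image_url", "image_url": url} for url in image_urls]
    return {
        "model": "pixtral-12b-2409",
        "messages": [{"role": "user", "content": content}],
    }


payload = build_pixtral_request(
    "Describe the chart in this image.",
    ["https://example.com/chart.png"],
)
print(json.dumps(payload, indent=2))
```

The payload would then be POSTed to the provider's chat-completions endpoint with an API key; only the request construction is shown here.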
Performance
Timeline
Released: September 17, 2024
Knowledge Cutoff: Unknown
Specifications
Parameters
12.4B
License
Apache 2.0
Training Data
Unknown
Benchmarks
Pixtral-12B Performance Across Datasets
Scores sourced from the model's scorecard, paper, or official blog posts
Pricing
Pricing, performance, and capabilities for Pixtral-12B across different providers:
| Provider | Input ($/M) | Output ($/M) | Max Input | Max Output | Latency (s) | Throughput | Quantization | Input | Output |
|---|---|---|---|---|---|---|---|---|---|
| Mistral AI | $0.15 | $0.15 | 128.0K | 8.2K | 0.5 | 0.1 c/s | — | Text, Image | Text |
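Given the per-million-token rates above, estimating the cost of a request is simple arithmetic. A small helper using the listed Mistral AI rates (the token counts in the example are illustrative):

```python
# Estimate request cost from the listed Mistral AI rates for Pixtral-12B:
# $0.15 per million input tokens and $0.15 per million output tokens.
INPUT_RATE_PER_M = 0.15
OUTPUT_RATE_PER_M = 0.15


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one request."""
    return (
        input_tokens * INPUT_RATE_PER_M / 1_000_000
        + output_tokens * OUTPUT_RATE_PER_M / 1_000_000
    )


# Example: a 10,000-token prompt (e.g. a document plus images) with a
# 1,000-token reply costs about $0.00165.
print(f"${request_cost(10_000, 1_000):.5f}")  # → $0.00165
```

Note that images consume input tokens in proportion to their resolution, so multimodal prompts can be substantially larger than their text alone suggests.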
API Access
API Access Coming Soon
API access for Pixtral-12B will be available soon through our gateway; in the meantime, the model can be accessed directly via Mistral AI's API.
FAQ
Common questions about Pixtral-12B
When was Pixtral-12B released?
Pixtral-12B was released on September 17, 2024, by Mistral AI.

Who created Pixtral-12B?
Pixtral-12B was created by Mistral AI.

How many parameters does Pixtral-12B have?
Pixtral-12B has 12.4 billion parameters, including a 400M-parameter vision encoder.

What license is Pixtral-12B released under?
Pixtral-12B is released under the Apache 2.0 license, a permissive open-source license, so the model is open-weight.

Is Pixtral-12B multimodal?
Yes, Pixtral-12B is a multimodal model that can process both text and images as input.