Model Comparison

Magistral Medium vs Mistral Large 3 (675B Instruct 2512)

Magistral Medium outperforms Mistral Large 3 (675B Instruct 2512) on both benchmarks the two models share.

Performance Benchmarks

Comparative analysis across standard metrics

2 benchmarks

Magistral Medium leads on both shared benchmarks (GPQA and LiveCodeBench); Mistral Large 3 (675B Instruct 2512) leads on none.

Sun May 17 2026 • llm-stats.com

Model Size

Parameter count comparison

651.0B diff

Mistral Large 3 (675B Instruct 2512) has 651.0B more parameters than Magistral Medium, making it 2712.5% larger.
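
The figures above follow from simple arithmetic on the two listed parameter counts. A minimal sketch, assuming the page's values of 24.0B and 675.0B; the function name `size_gap` is hypothetical and only used for illustration:

```python
def size_gap(small_b: float, large_b: float) -> tuple[float, float, float]:
    """Return (difference in billions, percent larger, size ratio)."""
    diff_b = large_b - small_b              # absolute gap in billions
    pct_larger = diff_b / small_b * 100     # gap relative to the smaller model
    ratio = large_b / small_b               # how many times the size
    return diff_b, pct_larger, ratio


# Parameter counts in billions, as listed on this page.
diff_b, pct_larger, ratio = size_gap(24.0, 675.0)
print(f"{diff_b:.1f}B diff, {pct_larger:.1f}% larger, {ratio:.3f}x the size")
```

Note that "2712.5% larger" and "about 28 times the size" describe the same gap: the first measures the difference relative to the smaller model, the second the ratio of the two counts.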

Magistral Medium

24.0B parameters

Mistral Large 3 (675B Instruct 2512)

675.0B parameters

Context Window

Maximum input and output token capacity

Only Mistral Large 3 (675B Instruct 2512) specifies a context window: 262,100 tokens for both input and output. No figures are listed for Magistral Medium.

Magistral Medium

Input: not specified

Output: not specified

Mistral Large 3 (675B Instruct 2512)

Input: 262,100 tokens

Output: 262,100 tokens

Input Capabilities

Supported data types and modalities

Both Magistral Medium and Mistral Large 3 (675B Instruct 2512) are listed as supporting multimodal input: text, images, audio, and video.

Magistral Medium

Text

Images

Audio

Video

Mistral Large 3 (675B Instruct 2512)

Text

Images

Audio

Video

License

Usage and distribution terms

Both models are listed under the Apache 2.0 license with open weights, so the same usage and distribution terms apply to each.

Magistral Medium

Apache 2.0

Open weights

Mistral Large 3 (675B Instruct 2512)

Apache 2.0

Open weights

Release Timeline

When each model was launched

Magistral Medium was released on 2025-06-10, while Mistral Large 3 (675B Instruct 2512) was released on 2025-12-04.

Mistral Large 3 (675B Instruct 2512) is just under 6 months (177 days) newer than Magistral Medium.
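
The gap between the two release dates can be checked directly. A minimal sketch using the dates given above; the helper `whole_months_between` is hypothetical, and counting completed calendar months is only one plausible convention for a month-based label:

```python
from datetime import date


def whole_months_between(start: date, end: date) -> int:
    """Count fully completed calendar months from start to end."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        months -= 1  # the final month has not fully elapsed yet
    return months


magistral_release = date(2025, 6, 10)
large3_release = date(2025, 12, 4)

gap_days = (large3_release - magistral_release).days
gap_months = whole_months_between(magistral_release, large3_release)
print(gap_days, "days,", gap_months, "completed months")
```

Under these assumptions the gap is 177 days: five completed months, or roughly six when rounded to the nearest month.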

Magistral Medium

Jun 10, 2025

11 months ago

Mistral Large 3 (675B Instruct 2512)

Dec 4, 2025

5 months ago

Just under 6 months newer

Knowledge Cutoff

When training data ends

Magistral Medium has a documented knowledge cutoff of 2025-06-01, while Mistral Large 3 (675B Instruct 2512)'s cutoff date is not specified.

Without a cutoff date for Mistral Large 3 (675B Instruct 2512), the two models cannot be compared directly on this point.

Magistral Medium

Jun 2025

Mistral Large 3 (675B Instruct 2512)

—

Key Takeaways

Magistral Medium

Mistral AI

Higher GPQA score (70.8% vs 43.9%)

Higher LiveCodeBench score (50.3% vs 34.4%)

Mistral Large 3 (675B Instruct 2512)

Mistral AI

Larger context window (262,100 tokens)

Detailed Comparison

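AI Model Comparison Table
Feature	Magistral Medium	Mistral Large 3 (675B Instruct 2512)
Developer	Mistral AI	Mistral AI
Parameters	24.0B	675.0B
Input context	Not specified	262,100 tokens
Output context	Not specified	262,100 tokens
Input modalities	Text, images, audio, video	Text, images, audio, video
License	Apache 2.0 (open weights)	Apache 2.0 (open weights)
Release date	2025-06-10	2025-12-04
Knowledge cutoff	2025-06-01	Not specified
GPQA	70.8%	43.9%
LiveCodeBench	50.3%	34.4%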

FAQ

Common questions about Magistral Medium vs Mistral Large 3 (675B Instruct 2512).

Which is better, Magistral Medium or Mistral Large 3 (675B Instruct 2512)?

Magistral Medium leads on both benchmarks the two models share (GPQA and LiveCodeBench). Both models are made by Mistral AI. The best choice depends on your use case, so weigh the benchmark scores, context window, and capabilities above.

How does Magistral Medium compare to Mistral Large 3 (675B Instruct 2512) in benchmarks?

Magistral Medium scores AIME 2024: 73.6%, GPQA: 70.8%, AIME 2025: 64.9%, LiveCodeBench: 50.3%, Aider-Polyglot: 47.1%. Mistral Large 3 (675B Instruct 2512) scores MMMLU: 85.5%, AMC_2022_23: 52.0%, GPQA: 43.9%, LiveCodeBench: 34.4%, SimpleQA: 23.8%.
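
Only benchmarks reported for both models allow a head-to-head comparison. A minimal sketch that finds the overlap in the scores listed above and computes each margin in percentage points; the function name `head_to_head` is hypothetical:

```python
# Scores in percent, as listed in the answer above.
magistral_medium = {
    "AIME 2024": 73.6, "GPQA": 70.8, "AIME 2025": 64.9,
    "LiveCodeBench": 50.3, "Aider-Polyglot": 47.1,
}
mistral_large_3 = {
    "MMMLU": 85.5, "AMC_2022_23": 52.0, "GPQA": 43.9,
    "LiveCodeBench": 34.4, "SimpleQA": 23.8,
}


def head_to_head(a: dict[str, float], b: dict[str, float]) -> dict[str, float]:
    """Margin (a minus b, in percentage points) on benchmarks both report."""
    shared = sorted(a.keys() & b.keys())
    return {name: round(a[name] - b[name], 1) for name in shared}


margins = head_to_head(magistral_medium, mistral_large_3)
wins = sum(1 for m in margins.values() if m > 0)
print(margins, f"wins for Magistral Medium: {wins} of {len(margins)}")
```

The other six listed scores each belong to one model only, so they say nothing about relative performance.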

What are the context window sizes for Magistral Medium and Mistral Large 3 (675B Instruct 2512)?

Magistral Medium's context window is not specified, while Mistral Large 3 (675B Instruct 2512) supports about 262K tokens (262,100). A larger context window lets you process longer documents, conversations, or codebases in a single request.