Model Comparison

Mistral Small 3.2 24B Instruct vs Phi 4 Reasoning

Phi 4 Reasoning significantly outperforms across most benchmarks.

Performance Benchmarks

Comparative analysis across standard metrics

3 benchmarks

Phi 4 Reasoning leads on all three shared benchmarks (Arena Hard, GPQA, MMLU-Pro); Mistral Small 3.2 24B Instruct leads on none.


Thu Apr 30 2026 • llm-stats.com

Arena Performance

Human preference votes

Pricing Analysis

Price comparison per million tokens

Cost data unavailable.
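Although no prices are listed for these two models, the "per million tokens" unit used above converts to a request cost in a straightforward way. A minimal sketch, using hypothetical prices and token counts (none of these numbers come from the page):

```python
# Per-million-token pricing: cost = (token count / 1_000_000) * price.
# All prices and token counts below are hypothetical placeholders,
# not real quotes for either model.
def request_cost(input_tokens, output_tokens, price_in, price_out):
    """Return the USD cost of one request given per-million-token prices."""
    return (input_tokens / 1_000_000) * price_in \
         + (output_tokens / 1_000_000) * price_out

# Hypothetical: $0.10/M input, $0.30/M output, for a 2,000-in / 500-out request.
cost = request_cost(2_000, 500, 0.10, 0.30)
print(f"${cost:.6f}")  # $0.000350
```

Once real prices are published, substituting them into the same formula gives a directly comparable cost per request.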

Lowest available price from all providers
Mistral AI
Mistral Small 3.2 24B Instruct
Input tokens: not listed
Output tokens: not listed
Best provider: not listed
Microsoft
Phi 4 Reasoning
Input tokens: not listed
Output tokens: not listed
Best provider: not listed

Model Size

Parameter count comparison

9.6B diff

Mistral Small 3.2 24B Instruct has 9.6B more parameters than Phi 4 Reasoning, making it 68.6% larger.
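The "68.6% larger" figure follows directly from the two parameter counts; a quick sketch of the arithmetic:

```python
# Deriving the size comparison from the two parameter counts (in billions).
mistral_params = 23.6  # Mistral Small 3.2 24B Instruct
phi_params = 14.0      # Phi 4 Reasoning

diff = mistral_params - phi_params        # absolute gap in billions
relative = diff / phi_params * 100        # percent larger than Phi 4 Reasoning
print(f"{diff:.1f}B difference, {relative:.1f}% larger")
# -> 9.6B difference, 68.6% larger
```

Note the percentage is relative to the smaller model: 9.6B on top of 14.0B is a 68.6% increase, while the same gap measured against 23.6B would be only about 40.7%.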

Mistral AI
Mistral Small 3.2 24B Instruct
23.6B parameters
Microsoft
Phi 4 Reasoning
14.0B parameters

Input Capabilities

Supported data types and modalities

Mistral Small 3.2 24B Instruct supports multimodal inputs, whereas Phi 4 Reasoning does not.

Mistral Small 3.2 24B Instruct can handle both text and other forms of data like images, making it suitable for multimodal applications.

Mistral Small 3.2 24B Instruct

Text: supported
Images: supported
Audio: not supported
Video: not supported

Phi 4 Reasoning

Text: supported
Images: not supported
Audio: not supported
Video: not supported

License

Usage and distribution terms

Mistral Small 3.2 24B Instruct is licensed under Apache 2.0, while Phi 4 Reasoning uses MIT.

License differences may affect how you can use these models in commercial or open-source projects.

Mistral Small 3.2 24B Instruct

Apache 2.0

Open weights

Phi 4 Reasoning

MIT

Open weights

Release Timeline

When each model was launched

Mistral Small 3.2 24B Instruct was released on 2025-06-20, while Phi 4 Reasoning was released on 2025-04-30.

Mistral Small 3.2 24B Instruct is 2 months newer than Phi 4 Reasoning.
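The exact gap between the two release dates quoted above is 51 days, which is why it rounds differently depending on convention (about 1.7 average-length months). A quick check:

```python
from datetime import date

# Release dates as stated on this page.
mistral_release = date(2025, 6, 20)  # Mistral Small 3.2 24B Instruct
phi_release = date(2025, 4, 30)      # Phi 4 Reasoning

gap = mistral_release - phi_release
print(gap.days)                     # 51
print(round(gap.days / 30.44, 1))   # 1.7 (average months per 30.44-day month)
```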

Mistral Small 3.2 24B Instruct

Jun 20, 2025

10 months ago

2mo newer
Phi 4 Reasoning

Apr 30, 2025

1 year ago

Knowledge Cutoff

When training data ends

Mistral Small 3.2 24B Instruct has a knowledge cutoff of 2023-10-01, while Phi 4 Reasoning has a cutoff of 2025-03-01.

Phi 4 Reasoning has more recent training data (up to 2025-03-01), making it potentially better informed about events through that date compared to Mistral Small 3.2 24B Instruct (2023-10-01).

Mistral Small 3.2 24B Instruct

Oct 2023

Phi 4 Reasoning

Mar 2025

1.4 yr newer

Key Takeaways

Mistral Small 3.2 24B Instruct supports multimodal inputs; Phi 4 Reasoning does not
Phi 4 Reasoning has a higher Arena Hard score (73.3% vs 43.1%)
Phi 4 Reasoning has a higher GPQA score (65.8% vs 46.1%)
Phi 4 Reasoning has a higher MMLU-Pro score (74.3% vs 69.1%)

FAQ

Common questions about Mistral Small 3.2 24B Instruct vs Phi 4 Reasoning

Which model performs better overall?
Phi 4 Reasoning significantly outperforms across most benchmarks. Mistral Small 3.2 24B Instruct is made by Mistral AI and Phi 4 Reasoning is made by Microsoft. The best choice depends on your use case; compare their benchmark scores, pricing, and capabilities above.

How do their benchmark scores compare?
Mistral Small 3.2 24B Instruct scores DocVQA: 94.9%, AI2D: 92.9%, HumanEval Plus: 92.9%, ChartQA: 87.4%, IF: 84.8%. Phi 4 Reasoning scores FlenQA: 97.7%, HumanEval+: 92.9%, IFEval: 83.4%, OmniMath: 76.6%, AIME 2024: 75.3%.

What are the key differences?
Key differences include multimodal support (Mistral Small 3.2 24B Instruct supports it; Phi 4 Reasoning does not) and licensing (Apache 2.0 vs MIT). See the full comparison above for benchmark-by-benchmark results.

Who develops each model?
Mistral Small 3.2 24B Instruct is developed by Mistral AI and Phi 4 Reasoning is developed by Microsoft.