Model Comparison

Mistral Small 3.2 24B Instruct vs Phi 4 Reasoning

Phi 4 Reasoning significantly outperforms across most benchmarks.

Performance Benchmarks

Comparative analysis across standard metrics

3 benchmarks

Phi 4 Reasoning leads on all three shared benchmarks (Arena Hard, GPQA, MMLU-Pro); Mistral Small 3.2 24B Instruct leads on none.


Thu Apr 30 2026 • llm-stats.com

Arena Performance

Human preference votes

Pricing Analysis

Price comparison per million tokens

Cost data unavailable.
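Although no prices are listed for these two models, the "per million tokens" unit used above converts to a request cost in a straightforward way. A minimal sketch, using hypothetical prices and token counts (none of these numbers come from the page):

```python
# Per-million-token pricing: cost = (token count / 1_000_000) * price.
# All prices and token counts below are hypothetical placeholders,
# not real quotes for either model.
def request_cost(input_tokens, output_tokens, price_in, price_out):
    """Return the USD cost of one request given per-million-token prices."""
    return (input_tokens / 1_000_000) * price_in \
         + (output_tokens / 1_000_000) * price_out

# Hypothetical: $0.10/M input, $0.30/M output, for a 2,000-in / 500-out request.
cost = request_cost(2_000, 500, 0.10, 0.30)
print(f"${cost:.6f}")  # $0.000350
```

Once real prices are published, substituting them into the same formula gives a directly comparable cost per request.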

Lowest available price from all providers
Mistral AI
Mistral Small 3.2 24B Instruct
Input tokens: not listed
Output tokens: not listed
Best provider: not listed
Microsoft
Phi 4 Reasoning
Input tokens: not listed
Output tokens: not listed
Best provider: not listed

Model Size

Parameter count comparison

9.6B diff

Mistral Small 3.2 24B Instruct has 9.6B more parameters than Phi 4 Reasoning, making it 68.6% larger.
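The "68.6% larger" figure follows directly from the two parameter counts; a quick sketch of the arithmetic:

```python
# Deriving the size comparison from the two parameter counts (in billions).
mistral_params = 23.6  # Mistral Small 3.2 24B Instruct
phi_params = 14.0      # Phi 4 Reasoning

diff = mistral_params - phi_params        # absolute gap in billions
relative = diff / phi_params * 100        # percent larger than Phi 4 Reasoning
print(f"{diff:.1f}B difference, {relative:.1f}% larger")
# -> 9.6B difference, 68.6% larger
```

Note the percentage is relative to the smaller model: 9.6B on top of 14.0B is a 68.6% increase, while the same gap measured against 23.6B would be only about 40.7%.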

Mistral AI
Mistral Small 3.2 24B Instruct
23.6B parameters
Microsoft
Phi 4 Reasoning
14.0B parameters

Input Capabilities

Supported data types and modalities

Mistral Small 3.2 24B Instruct supports multimodal inputs, whereas Phi 4 Reasoning does not.

Mistral Small 3.2 24B Instruct can handle both text and other forms of data like images, making it suitable for multimodal applications.

Mistral Small 3.2 24B Instruct

Text: supported
Images: supported
Audio: not supported
Video: not supported

Phi 4 Reasoning

Text: supported
Images: not supported
Audio: not supported
Video: not supported

License

Usage and distribution terms

Mistral Small 3.2 24B Instruct is licensed under Apache 2.0, while Phi 4 Reasoning uses MIT.

License differences may affect how you can use these models in commercial or open-source projects.

Mistral Small 3.2 24B Instruct

Apache 2.0

Open weights

Phi 4 Reasoning

MIT

Open weights

Release Timeline

When each model was launched

Mistral Small 3.2 24B Instruct was released on 2025-06-20, while Phi 4 Reasoning was released on 2025-04-30.

Mistral Small 3.2 24B Instruct is 2 months newer than Phi 4 Reasoning.
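The exact gap between the two release dates quoted above is 51 days, which is why it rounds differently depending on convention (about 1.7 average-length months). A quick check:

```python
from datetime import date

# Release dates as stated on this page.
mistral_release = date(2025, 6, 20)  # Mistral Small 3.2 24B Instruct
phi_release = date(2025, 4, 30)      # Phi 4 Reasoning

gap = mistral_release - phi_release
print(gap.days)                     # 51
print(round(gap.days / 30.44, 1))   # 1.7 (average months per 30.44-day month)
```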

Mistral Small 3.2 24B Instruct

Jun 20, 2025

10 months ago

2mo newer
Phi 4 Reasoning

Apr 30, 2025

1 year ago

Knowledge Cutoff

When training data ends

Mistral Small 3.2 24B Instruct has a knowledge cutoff of 2023-10-01, while Phi 4 Reasoning has a cutoff of 2025-03-01.

Phi 4 Reasoning has more recent training data (up to 2025-03-01), making it potentially better informed about events through that date compared to Mistral Small 3.2 24B Instruct (2023-10-01).

Mistral Small 3.2 24B Instruct

Oct 2023

Phi 4 Reasoning

Mar 2025

1.4 yr newer

Key Takeaways

Mistral Small 3.2 24B Instruct supports multimodal inputs; Phi 4 Reasoning does not
Phi 4 Reasoning has a higher Arena Hard score (73.3% vs 43.1%)
Phi 4 Reasoning has a higher GPQA score (65.8% vs 46.1%)
Phi 4 Reasoning has a higher MMLU-Pro score (74.3% vs 69.1%)

FAQ

Common questions about Mistral Small 3.2 24B Instruct vs Phi 4 Reasoning

Which model performs better overall?
Phi 4 Reasoning significantly outperforms across most benchmarks. Mistral Small 3.2 24B Instruct is made by Mistral AI and Phi 4 Reasoning is made by Microsoft. The best choice depends on your use case; compare their benchmark scores, pricing, and capabilities above.

How do their benchmark scores compare?
Mistral Small 3.2 24B Instruct scores DocVQA: 94.9%, AI2D: 92.9%, HumanEval Plus: 92.9%, ChartQA: 87.4%, IF: 84.8%. Phi 4 Reasoning scores FlenQA: 97.7%, HumanEval+: 92.9%, IFEval: 83.4%, OmniMath: 76.6%, AIME 2024: 75.3%.

What are the key differences?
Key differences include multimodal support (Mistral Small 3.2 24B Instruct supports it; Phi 4 Reasoning does not) and licensing (Apache 2.0 vs MIT). See the full comparison above for benchmark-by-benchmark results.

Who develops each model?
Mistral Small 3.2 24B Instruct is developed by Mistral AI and Phi 4 Reasoning is developed by Microsoft.