Model Comparison

Codestral-22B vs Phi-3.5-mini-instruct

Codestral-22B scores higher than Phi-3.5-mini-instruct on both benchmarks where the two are compared directly (HumanEval and MBPP).

Performance Benchmarks

Comparative analysis across standard metrics

2 benchmarks

Codestral-22B outperforms in 2 benchmarks (HumanEval, MBPP), while Phi-3.5-mini-instruct is better at 0 benchmarks.
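The head-to-head tally above can be reproduced from the two shared scores listed under Key Takeaways on this page. The following is a minimal sketch, not the site's own method; the `scores` dictionary layout and the helper names `tally_wins` and `margins` are illustrative assumptions.

```python
# Scores (percent) as listed under Key Takeaways on this page.
# The dictionary layout is an assumption made for illustration.
scores = {
    "HumanEval": {"Codestral-22B": 81.1, "Phi-3.5-mini-instruct": 62.8},
    "MBPP": {"Codestral-22B": 78.2, "Phi-3.5-mini-instruct": 69.6},
}


def tally_wins(scores):
    """Count, per model, the benchmarks on which it has the highest score."""
    wins = {}
    for by_model in scores.values():
        for model in by_model:
            wins.setdefault(model, 0)
        best = max(by_model, key=by_model.get)
        wins[best] += 1
    return wins


def margins(scores, model_a, model_b):
    """Return model_a's lead over model_b in percentage points per benchmark."""
    return {
        bench: round(by_model[model_a] - by_model[model_b], 1)
        for bench, by_model in scores.items()
    }


if __name__ == "__main__":
    print(tally_wins(scores))
    print(margins(scores, "Codestral-22B", "Phi-3.5-mini-instruct"))
```

With only two shared benchmarks, both of them coding tasks, the tally says little about general capability; the margins (in percentage points) are more informative than the win count.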

Fri Jun 05 2026 • llm-stats.com

Model Size

Parameter count comparison

18.4B diff

Codestral-22B has 18.4B more parameters than Phi-3.5-mini-instruct, making it 484.2% larger.
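The size figures above follow from simple arithmetic on the two parameter counts. A short sketch, assuming the counts are expressed in billions; the function name `size_gap` is hypothetical.

```python
def size_gap(larger_b, smaller_b):
    """Return (absolute difference in billions, percent by which larger exceeds smaller)."""
    diff = larger_b - smaller_b
    pct_larger = diff / smaller_b * 100
    return round(diff, 1), round(pct_larger, 1)


if __name__ == "__main__":
    # Parameter counts in billions, as listed on this page.
    diff, pct = size_gap(22.2, 3.8)
    print(f"{diff}B more parameters, {pct}% larger")
```

Note that "484.2% larger" describes the difference relative to the smaller model; as a ratio, Codestral-22B has roughly 5.8 times as many parameters.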

Codestral-22B

22.2B parameters

Phi-3.5-mini-instruct

3.8B parameters

Context Window

Maximum input and output token capacity

Only Phi-3.5-mini-instruct specifies input context (128,000 tokens). Only Phi-3.5-mini-instruct specifies output context (128,000 tokens).

Codestral-22B

Input: not specified

Output: not specified

Phi-3.5-mini-instruct

Input: 128,000 tokens

Output: 128,000 tokens

License

Usage and distribution terms

Codestral-22B is licensed under MNPL-0.1, while Phi-3.5-mini-instruct uses MIT.

License differences may affect how you can use these models in commercial or open-source projects.

Codestral-22B

MNPL-0.1

Open weights

Phi-3.5-mini-instruct

MIT

Open weights

Release Timeline

When each model was launched

Codestral-22B was released on 2024-05-29, while Phi-3.5-mini-instruct was released on 2024-08-23.

Phi-3.5-mini-instruct is about 3 months (86 days) newer than Codestral-22B.
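The gap between the two release dates can be counted in days or in whole calendar months, and the two conventions give slightly different labels. A minimal sketch using the dates listed above; the helper name `whole_months_between` is hypothetical.

```python
from datetime import date

# Release dates as listed on this page.
codestral = date(2024, 5, 29)
phi = date(2024, 8, 23)

# Exact gap in days.
gap_days = (phi - codestral).days


def whole_months_between(start, end):
    """Count fully completed calendar months between two dates."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        months -= 1  # the final month has not yet been completed
    return months


if __name__ == "__main__":
    print(gap_days)
    print(whole_months_between(codestral, phi))
    # Rounded to the nearest month, using an average month of 30.44 days.
    print(round(gap_days / 30.44))
```

Counting only completed months gives 2, while rounding the 86-day gap to the nearest month gives 3, so both figures describe the same interval.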

Codestral-22B

May 29, 2024

2.0 years ago

Phi-3.5-mini-instruct

Aug 23, 2024

1.8 years ago

86 days (about 3 months) newer

Knowledge Cutoff

When training data ends

Neither model specifies a knowledge cutoff date.

Unable to compare the recency of their training data.

No cutoff dates available

Key Takeaways

Codestral-22B

Mistral AI

Higher HumanEval score (81.1% vs 62.8%)

Higher MBPP score (78.2% vs 69.6%)

Phi-3.5-mini-instruct

Microsoft

Specified context window (128,000 tokens); Codestral-22B's context window is not listed

Detailed Comparison

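AI Model Comparison Table
Feature	Codestral-22B	Phi-3.5-mini-instruct
Developer	Mistral AI	Microsoft
Parameters	22.2B	3.8B
Input context	Not specified	128,000 tokens
Output context	Not specified	128,000 tokens
License	MNPL-0.1	MIT
Release date	2024-05-29	2024-08-23
Knowledge cutoff	Not specified	Not specified
HumanEval	81.1%	62.8%
MBPP	78.2%	69.6%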

FAQ

Common questions about Codestral-22B vs Phi-3.5-mini-instruct.

Which is better, Codestral-22B or Phi-3.5-mini-instruct?

Codestral-22B scores higher on both benchmarks the two models share (HumanEval and MBPP). Codestral-22B is made by Mistral AI and Phi-3.5-mini-instruct is made by Microsoft. The best choice depends on your use case: compare their benchmark scores, licenses, context windows, and sizes above.

How does Codestral-22B compare to Phi-3.5-mini-instruct in benchmarks?

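Codestral-22B's listed scores are HumanEvalFIM-Average 91.6%, HumanEval 81.1%, MBPP 78.2%, Spider 63.5%, and HumanEval-Average 61.5%. Phi-3.5-mini-instruct's listed scores are GSM8k 86.2%, ARC-C 84.6%, RULER 84.1%, PIQA 81.0%, and OpenBookQA 79.2%. These two lists do not overlap. On the two benchmarks reported for both models, Codestral-22B leads on HumanEval (81.1% vs 62.8%) and MBPP (78.2% vs 69.6%).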

What are the context window sizes for Codestral-22B and Phi-3.5-mini-instruct?

No context window is specified for Codestral-22B on this page, while Phi-3.5-mini-instruct supports 128K tokens. A larger context window lets you process longer documents, conversations, or codebases in a single request.

What are the main differences between Codestral-22B and Phi-3.5-mini-instruct?

Key differences include licensing (MNPL-0.1 vs MIT), model size (22.2B vs 3.8B parameters), and context window (not specified vs 128,000 tokens). See the full comparison above for benchmark-by-benchmark results.

Who makes Codestral-22B and Phi-3.5-mini-instruct?

Codestral-22B is developed by Mistral AI and Phi-3.5-mini-instruct is developed by Microsoft.