Model Comparison

Codestral-22B vs Gemma 3 4B: Which is better in 2026?

Codestral-22B leads on both benchmarks the two models share (HumanEval and MBPP).

Verdict: Codestral-22B vs Gemma 3 4B — which is better?

Codestral-22B (by Mistral AI) and Gemma 3 4B (by Google) are two frequently compared AI models. Here is how they stack up on benchmarks, size, licensing and capabilities, and which one to pick in 2026.

Codestral-22B leads on both shared benchmarks, HumanEval and MBPP; Gemma 3 4B leads on neither. Only two benchmarks are shared, and both measure coding ability, so the comparison is narrow.

Choose Codestral-22B if…

you want the stronger coding performance: it leads on 2 of 2 shared benchmarks (HumanEval and MBPP)

Choose Gemma 3 4B if…

you want the newer model: it shipped Mar 2025 and has a documented knowledge cutoff of Aug 2024


Performance Benchmarks

Comparative analysis across standard metrics

2 benchmarks

Codestral-22B leads on both shared benchmarks: HumanEval (81.1% vs 71.3%) and MBPP (78.2% vs 63.2%). Gemma 3 4B leads on neither.

Thu Jun 11 2026 • llm-stats.com


Model Size

Parameter count comparison

Codestral-22B has 18.2B more parameters than Gemma 3 4B (22.2B vs 4.0B), making it 455.0% larger, or roughly 5.5 times the size.

Codestral-22B: 22.2B parameters

Gemma 3 4B: 4.0B parameters
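
The size figures above follow from simple arithmetic on the two parameter counts. The sketch below reproduces them; the variable names are illustrative, and the inputs are the rounded counts listed on this page, so the results inherit that rounding.

```python
# Parameter counts in billions, as listed in this comparison (rounded values).
codestral_params_b = 22.2
gemma_params_b = 4.0

# Absolute difference, size ratio, and "percent larger" (ratio minus one).
diff_b = codestral_params_b - gemma_params_b
ratio = codestral_params_b / gemma_params_b
percent_larger = (ratio - 1) * 100

print(f"Difference: {diff_b:.1f}B parameters")
print(f"Size ratio: {ratio:.2f}x")
print(f"Percent larger: {percent_larger:.1f}%")
```

Note that "455% larger" and "about 5.5 times the size" describe the same gap: the first subtracts the baseline, the second does not.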

Context Window

Maximum input and output token capacity

Only Gemma 3 4B lists context limits: 131,072 input tokens and 131,072 output tokens. No limits are listed for Codestral-22B.

Codestral-22B: input not specified, output not specified

Gemma 3 4B: input 131,072 tokens, output 131,072 tokens


Input Capabilities

Supported data types and modalities

Gemma 3 4B supports multimodal inputs, whereas Codestral-22B does not.

Gemma 3 4B can handle both text and other forms of data like images, making it suitable for multimodal applications.

Codestral-22B: Text

Gemma 3 4B: Text, Images

License

Usage and distribution terms

Codestral-22B is licensed under MNPL-0.1 (the Mistral AI Non-Production License), while Gemma 3 4B uses the Gemma license.

License differences may affect how you can use these models in commercial or open-source projects.

Codestral-22B: MNPL-0.1, open weights

Gemma 3 4B: Gemma, open weights

Release Timeline

When each model was launched

Codestral-22B was released on 2024-05-29, while Gemma 3 4B was released on 2025-03-12.

Gemma 3 4B is about 9 months (287 days) newer than Codestral-22B.

Codestral-22B: May 29, 2024 (2.0 years ago)

Gemma 3 4B: Mar 12, 2025 (1.2 years ago)
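
The release gap and the "years ago" figures can be checked with plain date arithmetic. This is a minimal sketch: the `years_ago` helper is hypothetical, the reference date is assumed to be the page date (Jun 11, 2026), and months are approximated with an average month length of 30.44 days.

```python
from datetime import date

# Release dates as listed in this comparison.
codestral_release = date(2024, 5, 29)
gemma_release = date(2025, 3, 12)

# Assumption: "years ago" is measured from the page date.
as_of = date(2026, 6, 11)

# Gap between the two releases, in days and in approximate months.
gap_days = (gemma_release - codestral_release).days
gap_months = gap_days / 30.44  # average month length (approximation)


def years_ago(released: date, reference: date) -> float:
    """Elapsed time in years, using a 365.25-day average year."""
    return (reference - released).days / 365.25


print(f"Gap: {gap_days} days, about {gap_months:.1f} months")
print(f"Codestral-22B: {years_ago(codestral_release, as_of):.1f} years ago")
print(f"Gemma 3 4B: {years_ago(gemma_release, as_of):.1f} years ago")
```

The gap works out to a little over nine months, which is why rounding it up to ten overstates the difference.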

Knowledge Cutoff

When training data ends

Gemma 3 4B has a documented knowledge cutoff of 2024-08-01, while Codestral-22B's cutoff date is not specified.

We can confirm Gemma 3 4B's training data extends to 2024-08-01, but cannot make a direct comparison without Codestral-22B's cutoff date.

Codestral-22B: not specified

Gemma 3 4B: Aug 2024


Key Takeaways

Codestral-22B (Mistral AI)

Higher HumanEval score (81.1% vs 71.3%)

Higher MBPP score (78.2% vs 63.2%)

Gemma 3 4B (Google)

Documented context window (131,072 tokens); Codestral-22B's is not specified

Supports multimodal inputs
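
The "2 of 2 shared benchmarks" tally can be reproduced from the scores listed on this page. The sketch below is one way to do it, not the site's actual method: it intersects the benchmark names reported for each model and counts who leads on each shared one. The dictionary layout and variable names are illustrative.

```python
# Scores (percent) as listed in this comparison for each model.
scores = {
    "Codestral-22B": {
        "HumanEvalFIM-Average": 91.6,
        "HumanEval": 81.1,
        "MBPP": 78.2,
        "Spider": 63.5,
        "HumanEval-Average": 61.5,
    },
    "Gemma 3 4B": {
        "IFEval": 90.2,
        "GSM8k": 89.2,
        "DocVQA": 75.8,
        "MATH": 75.6,
        "AI2D": 74.8,
        "HumanEval": 71.3,
        "MBPP": 63.2,
    },
}

model_a, model_b = "Codestral-22B", "Gemma 3 4B"

# Only benchmarks reported for both models can be compared head to head.
shared = sorted(set(scores[model_a]) & set(scores[model_b]))

# Count the leader on each shared benchmark; ties count for neither model.
wins = {model_a: 0, model_b: 0}
for bench in shared:
    a, b = scores[model_a][bench], scores[model_b][bench]
    if a > b:
        wins[model_a] += 1
    elif b > a:
        wins[model_b] += 1
    print(f"{bench}: {a}% vs {b}% (gap {abs(a - b):.1f} points)")

print(f"Shared benchmarks: {len(shared)}; wins: {wins}")
```

Most of each model's listed scores fall outside the intersection, which is a reminder that the head-to-head result rests on two coding benchmarks only.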

Detailed Comparison

AI Model Comparison Table
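Feature	Codestral-22B	Gemma 3 4B
Developer	Mistral AI	Google
Parameters	22.2B	4.0B
Input context	Not specified	131,072 tokens
Output context	Not specified	131,072 tokens
Input modalities	Text	Text, Images
License	MNPL-0.1 (open weights)	Gemma (open weights)
Release date	May 29, 2024	Mar 12, 2025
Knowledge cutoff	Not specified	Aug 2024
HumanEval	81.1%	71.3%
MBPP	78.2%	63.2%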

FAQ

Common questions about Codestral-22B vs Gemma 3 4B.

Which is better, Codestral-22B or Gemma 3 4B?

Codestral-22B leads on both benchmarks the two models share (HumanEval and MBPP). Codestral-22B is made by Mistral AI and Gemma 3 4B is made by Google. The best choice depends on your use case: compare the benchmark scores, context window, input capabilities, and license above.

How does Codestral-22B compare to Gemma 3 4B in benchmarks?

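Codestral-22B's top listed scores are HumanEvalFIM-Average 91.6%, HumanEval 81.1%, MBPP 78.2%, Spider 63.5%, and HumanEval-Average 61.5%. Gemma 3 4B's top listed scores are IFEval 90.2%, GSM8k 89.2%, DocVQA 75.8%, MATH 75.6%, and AI2D 74.8%. On the two benchmarks they share, Codestral-22B leads: HumanEval (81.1% vs 71.3%) and MBPP (78.2% vs 63.2%).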

What are the context window sizes for Codestral-22B and Gemma 3 4B?

Codestral-22B's context window is not specified, while Gemma 3 4B supports 131K tokens (131,072). A larger context window lets you process longer documents, conversations, or codebases in a single request.

What are the main differences between Codestral-22B and Gemma 3 4B?

Key differences include multimodal support (no vs yes) and licensing (MNPL-0.1 vs Gemma). See the full comparison above for benchmark-by-benchmark results.

Who makes Codestral-22B and Gemma 3 4B?

Codestral-22B is developed by Mistral AI and Gemma 3 4B is developed by Google.
