Model Comparison
Llama 4 Scout vs Mercury 2
Which is better in 2026?
Mercury 2 leads on both shared benchmarks (GPQA and LiveCodeBench). Llama 4 Scout is about 2.8x cheaper per token.
Verdict: Llama 4 Scout vs Mercury 2 — which is better?
Llama 4 Scout (by Meta) and Mercury 2 (by Inception) are two of the AI models people compare most. Here is how they stack up on benchmarks, price and capabilities, and which one to pick in 2026.
Mercury 2 leads on both benchmarks the two models share (GPQA and LiveCodeBench); Llama 4 Scout leads on none.
On price, Llama 4 Scout is roughly 2.8x cheaper per token on a blended 3:1 input/output basis, which adds up quickly at production volume.
Llama 4 Scout also offers a larger context window (10,000,000 input tokens versus 128,000 for Mercury 2), making it the stronger choice for long documents and large codebases.
Choose Llama 4 Scout if…
- cost matters — it's about 2.8x cheaper per token
- you process long inputs — it offers a 10,000,000 token context window
- you need open weights you can self-host or fine-tune
Choose Mercury 2 if…
- you want the strongest raw capability — it leads on 2 of 2 shared benchmarks
- you want the newer release: it shipped Feb 2026, though neither model states a training-data cutoff
Performance Benchmarks
Comparative analysis across standard metrics
Of the two shared benchmarks (GPQA and LiveCodeBench), Mercury 2 scores higher on both and Llama 4 Scout on neither.
Pricing Analysis
Price comparison per million tokens
For input processing, Llama 4 Scout ($0.08/1M tokens) is 3.1x cheaper than Mercury 2 ($0.25/1M tokens).
For output processing, Llama 4 Scout ($0.30/1M tokens) is 2.5x cheaper than Mercury 2 ($0.75/1M tokens).
Overall, at a 3:1 ratio of input to output tokens, Llama 4 Scout costs about $0.135 per 1M blended tokens and Mercury 2 about $0.375, so Mercury 2 is roughly 2.8x more expensive.
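The blended figure can be reproduced with a short calculation. The sketch below is illustrative only: it uses the per-million-token prices quoted in this section and the 3:1 input-to-output mix; the names `PRICES` and `blended_price` are made up for the example, and a different token mix would give a different ratio.

```python
# Prices in USD per 1M tokens, as quoted in this section.
PRICES = {
    "Llama 4 Scout": {"input": 0.08, "output": 0.30},
    "Mercury 2": {"input": 0.25, "output": 0.75},
}


def blended_price(prices, input_parts=3, output_parts=1):
    """Weighted average price per 1M tokens for a given input:output mix."""
    total_parts = input_parts + output_parts
    weighted = prices["input"] * input_parts + prices["output"] * output_parts
    return weighted / total_parts


scout = blended_price(PRICES["Llama 4 Scout"])
mercury = blended_price(PRICES["Mercury 2"])

print(f"Llama 4 Scout: ${scout:.3f} per 1M blended tokens")
print(f"Mercury 2:     ${mercury:.3f} per 1M blended tokens")
print(f"Mercury 2 costs {mercury / scout:.1f}x as much")
```

Changing `input_parts` and `output_parts` shows how the gap moves with workload: an input-heavy mix pushes the ratio toward 3.1x, an output-heavy mix toward 2.5x.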
Context Window
Maximum input and output token capacity
Llama 4 Scout accepts 10,000,000 input tokens, compared with 128,000 for Mercury 2. It can also generate longer responses: up to 10,000,000 output tokens, versus a limit of 8,192 for Mercury 2.
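As a rough illustration of what these limits mean in practice, the sketch below checks which model could take a request of a given size. It assumes the caller already has token counts (they depend on each model's tokenizer) and treats the input and output limits as independent, which may not match how a particular provider enforces them. `LIMITS`, `fits`, and `models_that_fit` are hypothetical names for this example.

```python
# Token limits as stated in this section.
LIMITS = {
    "Llama 4 Scout": {"max_input": 10_000_000, "max_output": 10_000_000},
    "Mercury 2": {"max_input": 128_000, "max_output": 8_192},
}


def fits(model, prompt_tokens, expected_output_tokens):
    """Return True if both token counts are within the model's stated limits."""
    limits = LIMITS[model]
    return (
        prompt_tokens <= limits["max_input"]
        and expected_output_tokens <= limits["max_output"]
    )


def models_that_fit(prompt_tokens, expected_output_tokens):
    """List the models whose stated limits cover the request."""
    return [m for m in LIMITS if fits(m, prompt_tokens, expected_output_tokens)]


# A 500k-token codebase with a short answer only fits Llama 4 Scout.
print(models_that_fit(500_000, 2_000))
# A 50k-token document with a 4k-token summary fits both.
print(models_that_fit(50_000, 4_000))
```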
Input Capabilities
Supported data types and modalities
Llama 4 Scout supports multimodal inputs, whereas Mercury 2 does not.
Llama 4 Scout can handle both text and other forms of data like images, making it suitable for multimodal applications.
License
Usage and distribution terms
Llama 4 Scout is licensed under the Llama 4 Community License Agreement, while Mercury 2 uses a proprietary license.
License differences may affect how you can use these models in commercial or open-source projects.
- Llama 4 Scout: Llama 4 Community License Agreement (open weights)
- Mercury 2: Proprietary (closed source)
Release Timeline
When each model was launched
Llama 4 Scout was released on 2025-04-05, while Mercury 2 was released on 2026-02-24.
Mercury 2 is about 10.5 months newer than Llama 4 Scout.
Knowledge Cutoff
When training data ends
Neither model specifies a knowledge cutoff date, so the recency of their training data cannot be compared.
Provider Availability
Llama 4 Scout is available from DeepInfra, Lambda, Novita, Groq, Fireworks, Together. Mercury 2 is available from Inception.