Comparison·Technical Deep Dive

Claude Fable 5 vs Claude Opus 4.8: Complete Comparison

Claude Fable 5 vs Opus 4.8 on benchmarks, pricing, and safeguards. Fable 5 leads on raw capability, but falls back to Opus 4.8 in guarded domains.

Jonathan Chavez
Co-Founder @ LLM Stats · 11 min read

Head to Head (Anthropic · June 2026)

Opus 4.8: general workhorse, $5 / $25 per 1M tokens
Fable 5: Mythos-class, above Opus, $10 / $50 per 1M tokens

A tier apart, until safeguards collapse them.

Fable 5 leads every unsafeguarded benchmark and costs exactly 2x. But on cybersecurity, biology, and chemistry it falls back to Opus 4.8, so in those domains you are paying double for the same answer.

On June 9, 2026, Anthropic shipped Claude Fable 5, a Mythos-class model a tier above Claude Opus 4.8. The two are from the same family, take the same inputs, and share the same 1M-token context. The question is not which is smarter (Fable 5 clearly is) but whether its capability premium is worth double the price for your workload. And there is a twist: in a set of guarded domains, Fable 5 does not just resemble Opus 4.8, it becomes it.


The Verdict

On every unsafeguarded benchmark both models report, Fable 5 leads Opus 4.8, often by a wide margin: +6.4 on SWE-bench Verified (95.0% vs 88.6%), +10.8 on SWE-bench Pro (80.0% vs 69.2%), and roughly 2x the score on FrontierCode (29.3% vs 13.4% on the Diamond subset). On Artificial Analysis's GDPval-AA knowledge-work Elo, Fable 5 is ahead at 1932 vs 1890. This is a real tier jump, not a point release.

Pricing is the clean part. Fable 5 is $10 / $50 per million input / output tokens, exactly 2x Opus 4.8's $5 / $25. Both share a 1M / 128K context window and text-plus-vision input, so the only commercial variable is the per-token rate.

The twist that decides most real deployments: Fable 5's safeguards fall back to Opus 4.8 on cybersecurity, biology, chemistry, and distillation requests. In those domains, calling Fable 5 gets you an Opus 4.8 answer at 2x the price. The right model is not a single winner. It is Fable 5 for frontier coding and knowledge work, and Opus 4.8 for routine traffic and anything that trips a safeguard.


Same Family, Two Tiers

Opus 4.8 is Anthropic's general workhorse: a strong, well-priced frontier model that shipped on May 28, 2026. Fable 5 is something different. It is the generally available, production-safeguarded deployment of the same weights as the restricted Claude Mythos 5, which Anthropic positions as a capability tier above Opus entirely.

So this comparison is not two sibling point-releases. It is a workhorse model against a deliberately gated frontier model, where the frontier model degrades to the workhorse precisely in the domains Anthropic considers sensitive. That relationship, not any single benchmark, is what makes the choice interesting.


Side-by-Side at a Glance

The commercial surface is nearly identical. The model tier and the per-token price are the only spec-sheet differences that matter.

| Spec | Claude Fable 5 | Claude Opus 4.8 |
| --- | --- | --- |
| Tier | Mythos-class (above Opus) | Opus / general |
| Release date | Jun 9, 2026 | May 28, 2026 |
| Model ID | claude-fable-5 | claude-opus-4-8 |
| Input / output price | $10 / $50 per 1M | $5 / $25 per 1M |
| Context window (in / out) | 1M / 128K | 1M / 128K |
| Modalities | Text + image, text out | Text + image, text out |
| Extended thinking | Yes | Yes |
| Safeguard fallback | Falls back to Opus 4.8 | n/a (is the fallback) |
| Availability | Generally available | Generally available |

Benchmark Head-to-Head

All scores are self-reported by Anthropic in the Fable 5 launch table and the system card. The chart below shows only unsafeguarded benchmarks, where Fable 5's published number is genuinely Fable 5 and not an Opus 4.8 fallback. This is the capability premium you pay 2x for.

Figure: The Capability Premium (Opus 4.8 vs Fable 5). Fable 5 leads every one, where safeguards stay quiet. Benchmarks shown: SWE-bench Verified, OSWorld-Verified, SWE-bench Pro, Blueprint-Bench 2, GDP.pdf, FrontierCode (Diamond), AutomationBench, and Legal Agent; the scores are listed in the tables that follow.
Self-reported by Anthropic. Unsafeguarded benchmarks only. Scores on a 0–100 scale. GDPval-AA (Elo) shown separately in the tables.

Software engineering & agentic work

| Benchmark | Fable 5 | Opus 4.8 | Lead |
| --- | --- | --- | --- |
| SWE-bench Verified | 95.0% | 88.6% | Fable +6.4 |
| SWE-bench Pro | 80.0% | 69.2% | Fable +10.8 |
| FrontierCode (Diamond) | 29.3% | 13.4% | Fable +15.9 |
| OSWorld-Verified | 85.0% | 83.4% | Fable +1.6 |

The FrontierCode gap is the headline. On the hard, less-saturated Diamond subset, Fable 5 more than doubles Opus 4.8 and takes first place outright. SWE-bench Verified is near its ceiling, so the +10.8 on SWE-bench Pro is the more meaningful coding signal. OSWorld-Verified is close: both models are strong on computer use.
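To make "more than doubles" versus "close" concrete, here is a small sketch that computes both the absolute lead and the score ratio for the four coding and agentic benchmarks. The scores are copied from the table above; the `lead_summary` helper is my own illustration and not part of any published tooling.

```python
# Scores copied from the table above (percent, self-reported by Anthropic).
# Each entry is (Fable 5, Opus 4.8).
SCORES = {
    "SWE-bench Verified": (95.0, 88.6),
    "SWE-bench Pro": (80.0, 69.2),
    "FrontierCode (Diamond)": (29.3, 13.4),
    "OSWorld-Verified": (85.0, 83.4),
}


def lead_summary(fable: float, opus: float) -> tuple[float, float]:
    """Return (absolute lead in points, Fable/Opus score ratio)."""
    return round(fable - opus, 1), round(fable / opus, 2)


for name, (fable, opus) in SCORES.items():
    points, ratio = lead_summary(fable, opus)
    print(f"{name}: +{points} points, {ratio}x")
```

The ratio column is what separates FrontierCode (about 2.19x) from OSWorld-Verified (about 1.02x), even though both show up as a positive lead in points.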

Knowledge work

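| Benchmark | Fable 5 | Opus 4.8 | Lead |
| --- | --- | --- | --- |
| GDPval-AA Elo | 1932 | 1890 | Fable +42 |
| GDP.pdf | 29.8% | 22.5% | Fable +7.3 |
| Blueprint-Bench 2 | 38.6% | 14.5% | Fable +24.1 |
| AutomationBench | 17.4% | 12.9% | Fable +4.5 |
| Legal Agent Benchmark | 13.3% | 10.4% | Fable +2.9 |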

Fable 5 leads every knowledge-work benchmark too. The absolute numbers on AutomationBench and the Legal Agent Benchmark are low across the board because these are deliberately hard, long-horizon business workflows where every frontier model still has room to grow. The point is the direction: Fable 5 is ahead on all of them.


Where Fable 5 Becomes Opus 4.8

Here is the part that does not show up in a normal spec sheet. Fable 5 runs classifiers over every request. When one fires on cybersecurity, biology or chemistry, or a distillation attempt, the model does not answer with Mythos-class capability. It routes to Opus 4.8.

Figure: The Convergence (effective capability, schematic). On an unguarded request, Fable 5 and Opus 4.8 sit a tier apart; on a guarded request, Fable 5 is Opus 4.8: same model, same answer. On cyber, bio/chem, and distillation requests, Fable 5 routes to Opus 4.8. Example: ExploitBench drops from a reported 78.0% (Mythos 5) to roughly Opus 4.8's 40.0%.

The clearest measured example is Terminal-Bench, where 20.9% of trials hit a safety refusal and reverted to Opus 4.8 for the rest of the run, pulling Fable 5's effective score down toward Opus territory. On the cyber and bio evaluations the launch table quotes the Mythos 5 numbers (78.0% on ExploitBench, 46.1% on BioMysteryBench hard); the model you actually call through the API performs much closer to Opus 4.8 on those same tasks.
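A rough way to reason about that pull is a weighted blend: trials that stay on Fable 5 score like Fable 5, and trials that fall back score like Opus 4.8. The sketch below is a first-order approximation under that assumption, not Anthropic's methodology. The 20.9% fallback share comes from the paragraph above; the two per-model scores are hypothetical placeholders, since the per-model Terminal-Bench numbers are not quoted here.

```python
def effective_score(full_score: float, fallback_score: float, fallback_rate: float) -> float:
    """Blend two per-trial scores, weighted by the share of trials that fell back."""
    if not 0.0 <= fallback_rate <= 1.0:
        raise ValueError("fallback_rate must be between 0 and 1")
    return (1.0 - fallback_rate) * full_score + fallback_rate * fallback_score


# 20.9% is the Terminal-Bench fallback share quoted above.
# The two scores are hypothetical placeholders, not published results.
blended = effective_score(full_score=80.0, fallback_score=60.0, fallback_rate=0.209)
print(f"{blended:.1f}")
```

Whatever the real inputs are, the blended value always lands between the two per-model scores, and it moves toward the Opus 4.8 number as the fallback share grows.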

The behavior also depends on the surface. In Claude client apps, a flagged request transparently falls back to Opus 4.8 and the user still gets an answer. On the Messages API, the default is to block: developers have to opt into the fallback, or the request is refused. Either way, in a guarded domain the best case is an Opus 4.8 answer, so paying the Fable 5 premium there buys you nothing.
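The source does not spell out how the API opt-in is expressed, so the sketch below does not guess at it. Instead it shows a client-side pattern with the same effect: try Fable 5, and if the request is blocked, retry once on Opus 4.8 directly. Everything here is hypothetical scaffolding: `SafeguardRefusal`, the `send(model, prompt)` wrapper, and the `fake_send` stand-in are illustrative names, not part of any real SDK. Only the two model IDs come from the spec table.

```python
from typing import Callable

FABLE = "claude-fable-5"   # model IDs as listed in the spec table
OPUS = "claude-opus-4-8"


class SafeguardRefusal(Exception):
    """Hypothetical signal that a request was blocked by a safeguard."""


def ask_with_fallback(prompt: str, send: Callable[[str, str], str]) -> tuple[str, str]:
    """Try Fable 5 first; on a safeguard block, retry once on Opus 4.8.

    `send(model, prompt)` is a hypothetical wrapper around whatever client
    you use. Returns (model_that_answered, answer).
    """
    try:
        return FABLE, send(FABLE, prompt)
    except SafeguardRefusal:
        return OPUS, send(OPUS, prompt)


def fake_send(model: str, prompt: str) -> str:
    """Toy stand-in for a real client: blocks Fable 5 on prompts mentioning 'exploit'."""
    if model == FABLE and "exploit" in prompt:
        raise SafeguardRefusal(prompt)
    return f"[{model}] answer"


print(ask_with_fallback("refactor this module", fake_send))
print(ask_with_fallback("write an exploit", fake_send))
```

Returning the model that actually answered makes it easy to log how often the premium model was bypassed, which feeds directly into the cost question.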


Pricing: 2x for the Top Tier

The pricing comparison is unusually clean. Fable 5 is exactly double Opus 4.8 on both input and output. There is no long-prompt surcharge difference, no separate fast-mode tier in the comparison, just a flat 2x.

| Tier | Claude Fable 5 | Claude Opus 4.8 |
| --- | --- | --- |
| Input | $10.00 / 1M | $5.00 / 1M |
| Output | $50.00 / 1M | $25.00 / 1M |
| Premium | 2x | baseline |

Output tokens dominate frontier-model spend, and both models lean on extended thinking to reach their top scores, so the real per-task gap tracks output volume. Measure tokens per task on your own traffic before extrapolating: a workload where Fable 5 solves in one pass what Opus 4.8 needs two attempts for can erase part of the 2x on the bill, while a workload that trips safeguards pays the premium for an Opus answer.
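The per-task arithmetic is easy to sketch. The prices below come from the pricing table; the token counts and attempt counts are hypothetical and should be replaced with measurements from your own traffic.

```python
# Prices in USD per 1M tokens (input, output), from the pricing table.
PRICES = {
    "claude-fable-5": (10.00, 50.00),
    "claude-opus-4-8": (5.00, 25.00),
}


def task_cost(model: str, input_tokens: int, output_tokens: int, attempts: int = 1) -> float:
    """USD cost of one task, counting every attempt needed to finish it."""
    in_price, out_price = PRICES[model]
    per_attempt = (input_tokens * in_price + output_tokens * out_price) / 1_000_000
    return per_attempt * attempts


# Hypothetical task: 20K input tokens and 8K output tokens per attempt.
fable_once = task_cost("claude-fable-5", 20_000, 8_000, attempts=1)
opus_twice = task_cost("claude-opus-4-8", 20_000, 8_000, attempts=2)
print(f"Fable 5, one pass:   ${fable_once:.2f}")
print(f"Opus 4.8, two tries: ${opus_twice:.2f}")
```

With identical token counts, one Fable 5 pass costs the same as two Opus 4.8 attempts, which is the break-even point. If Fable 5 spends more output tokens per pass, for example on longer thinking, only part of the 2x is recovered.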


Which Model for Which Workload

The comparison resolves into one decision per workload. Below is the matrix I use when pointing a product surface at one or the other.

Decision Matrix: one pick per workload. Pay for the frontier, default to the workhorse.

| # | Workload | Examples | Pick | Why |
| --- | --- | --- | --- | --- |
| 01 | Frontier coding | Large-scale SWE, hard algorithms, FrontierCode | Fable 5 | +6 to +16 points where it counts |
| 02 | Long-horizon agents | Multi-step knowledge work, automation | Fable 5 | Leads GDPval-AA, AutomationBench |
| 03 | Routine & high volume | Summaries, extraction, everyday calls | Opus 4.8 | Opus clears the bar at half the price |
| 04 | Guarded domains | Security, biology, chemistry | Opus 4.8 | Fable falls back to Opus here anyway |

Fable 5 at $10 / $50, Opus 4.8 at $5 / $25 per 1M tokens. Route mixed traffic rather than committing all of it to one model.

A few cross-cutting rules of thumb on top of the matrix:

  • Frontier coding and agentic work go to Fable 5. Large-scale software engineering, hard algorithmic tasks (FrontierCode), and long-horizon agents are where the +6 to +16 point gaps actually change outcomes.
  • Routine and high-volume traffic stays on Opus 4.8. If Opus 4.8 already clears your quality bar, 2x for a marginal gain is hard to justify. Reserve Fable 5 for the jobs at the edge of what Opus can do.
  • Guarded-domain work stays on Opus 4.8. Security, biology, and chemistry requests fall back to Opus 4.8 on Fable 5 anyway. Call Opus 4.8 directly and skip the premium and the refusal risk.
  • Mixed workloads should route. Send the frontier slice to Fable 5 and everything else to Opus 4.8 rather than putting all traffic on one model.
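The rules above reduce to a lookup table. Here is a minimal routing sketch, assuming you can already label each request with one of the four workload types; how you classify traffic is outside its scope. The `Workload` enum and `pick_model` function are illustrative names, and the model IDs are the ones listed in the spec table.

```python
from enum import Enum


class Workload(Enum):
    FRONTIER_CODING = "frontier_coding"
    LONG_HORIZON_AGENT = "long_horizon_agent"
    ROUTINE = "routine"
    GUARDED_DOMAIN = "guarded_domain"


FABLE = "claude-fable-5"
OPUS = "claude-opus-4-8"

# One pick per workload, mirroring the decision matrix.
ROUTES = {
    Workload.FRONTIER_CODING: FABLE,
    Workload.LONG_HORIZON_AGENT: FABLE,
    Workload.ROUTINE: OPUS,
    Workload.GUARDED_DOMAIN: OPUS,
}


def pick_model(workload: Workload) -> str:
    """Return the model ID for a workload, defaulting to the cheaper model."""
    return ROUTES.get(workload, OPUS)


print(pick_model(Workload.FRONTIER_CODING))
print(pick_model(Workload.GUARDED_DOMAIN))
```

Defaulting to Opus 4.8 keeps any unclassified traffic on the cheaper model, in line with "default to the workhorse".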

For full structured benchmark data, see the model pages for Claude Fable 5 and Claude Opus 4.8, our Claude Fable 5 review, and Anthropic's launch post and system card.


Frequently Asked Questions

  • Is Claude Fable 5 better than Claude Opus 4.8? On raw capability, yes. Claude Fable 5 is a tier above: 95.0% vs 88.6% on SWE-bench Verified, 80.0% vs 69.2% on SWE-bench Pro, and 1932 vs 1890 Elo on GDPval-AA. But in guarded domains (cyber, bio, chem) Fable 5 falls back to Opus 4.8, so there the two are identical.
  • How much do the two models cost? Claude Fable 5 is $10 / $50 per million input / output tokens. Claude Opus 4.8 is $5 / $25. Fable 5 is exactly 2x the per-token price across both input and output.
  • When should I use Opus 4.8 instead of Fable 5? Use Opus 4.8 for routine and high-volume traffic where its capability is already enough, and for any workload in a guarded domain (security, biology, chemistry), where Fable 5 falls back to Opus 4.8 anyway. Paying 2x for an identical answer makes no sense.
  • Do both models have the same context window? Yes. Both support a 1 million token input context window and up to 128K output tokens, and both take text and image input with text output.
  • Why does Fable 5 fall back to Opus 4.8? Fable 5 runs the same weights as the restricted Claude Mythos 5. To make it safe for general use, Anthropic adds classifiers for cybersecurity, biology and chemistry, and distillation. When one fires, the request is routed to Claude Opus 4.8 instead of being served at full Mythos-class capability.
