Model Comparison

GPT-5.4 vs GPT-5.3 Codex

GPT-5.4 scores higher on 2 of the 3 benchmarks compared here. GPT-5.3 Codex is about 1.2x cheaper per token at a 3:1 mix of input to output tokens.

Performance Benchmarks

Comparative analysis across standard metrics

3 benchmarks

GPT-5.4 leads on 2 benchmarks, OSWorld-Verified (75.0% vs 64.7%) and SWE-Bench Pro (57.7% vs 56.8%), while GPT-5.3 Codex leads on 1, Terminal-Bench 2.0 (77.3% vs 75.1%).

Sat Apr 18 2026 • llm-stats.com

Pricing Analysis

Price comparison per million tokens

GPT-5.3 Codex costs less

For input processing, GPT-5.4 ($2.50/1M tokens) is 1.4x more expensive than GPT-5.3 Codex ($1.75/1M tokens).

For output processing, GPT-5.4 ($15.00/1M tokens) is 1.1x more expensive than GPT-5.3 Codex ($14.00/1M tokens).

Overall, GPT-5.4 is more expensive than GPT-5.3 Codex at a blended rate.*

* Blended rate assumes a 3:1 ratio of input to output tokens
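
The blended comparison can be reproduced with a short calculation. The sketch below assumes the blended rate is a plain weighted average of the input and output prices at 3 parts input to 1 part output; the page does not state its exact formula, so treat `blended_price` and its parameters as illustrative names rather than the site's method.

```python
# Prices in USD per 1M tokens, copied from the comparison above.
PRICES = {
    "GPT-5.4": {"input": 2.50, "output": 15.00},
    "GPT-5.3 Codex": {"input": 1.75, "output": 14.00},
}


def blended_price(input_price, output_price, input_parts=3, output_parts=1):
    """Weighted-average price per 1M tokens for a given input:output mix.

    Assumption: a simple weighted average; the source does not give its formula.
    """
    total_parts = input_parts + output_parts
    return (input_parts * input_price + output_parts * output_price) / total_parts


blended = {name: blended_price(p["input"], p["output"]) for name, p in PRICES.items()}
ratio = blended["GPT-5.4"] / blended["GPT-5.3 Codex"]

for name, price in blended.items():
    print(f"{name}: ${price:.4f} per 1M blended tokens")
print(f"GPT-5.4 costs {ratio:.2f}x the blended price of GPT-5.3 Codex")
```

Under this assumption the ratio rounds to 1.2x, which matches the headline figure. Changing `input_parts` and `output_parts` shows how the gap moves for workloads that are heavier on output.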

Lowest available price from all providers (per 1M tokens)

GPT-5.4 (OpenAI)
Input tokens: $2.50
Output tokens: $15.00
Best provider: OpenAI

GPT-5.3 Codex (OpenAI)
Input tokens: $1.75
Output tokens: $14.00
Best provider: OpenAI

Context Window

Maximum input and output token capacity

GPT-5.4 accepts 1,000,000 input tokens compared to GPT-5.3 Codex's 400,000 tokens. Both models can generate responses up to 128,000 tokens.

GPT-5.4 (OpenAI)
Input: 1,000,000 tokens
Output: 128,000 tokens

GPT-5.3 Codex (OpenAI)
Input: 400,000 tokens
Output: 128,000 tokens
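
As a rough illustration of what these limits mean in practice, the sketch below checks which model can take a request of a given size. It assumes the input and output limits apply independently, which the page does not confirm, and that token counts are already known; `Limits`, `fits`, and `models_that_fit` are hypothetical helper names, not part of any OpenAI API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    max_input_tokens: int
    max_output_tokens: int


# Limits copied from the comparison above.
LIMITS = {
    "GPT-5.4": Limits(1_000_000, 128_000),
    "GPT-5.3 Codex": Limits(400_000, 128_000),
}


def fits(model, input_tokens, output_tokens):
    """Return True if both token counts are within the model's listed limits.

    Assumption: input and output limits are checked separately.
    """
    limits = LIMITS[model]
    return (
        input_tokens <= limits.max_input_tokens
        and output_tokens <= limits.max_output_tokens
    )


def models_that_fit(input_tokens, output_tokens):
    """List the models whose limits accommodate the request."""
    return [model for model in LIMITS if fits(model, input_tokens, output_tokens)]


# A 600,000-token codebase with an 8,000-token answer only fits GPT-5.4.
print(models_that_fit(600_000, 8_000))
```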

Input Capabilities

Supported data types and modalities

Both GPT-5.4 and GPT-5.3 Codex support multimodal inputs.

Each accepts text, images, audio, and video.

GPT-5.4

Text
Images
Audio
Video

GPT-5.3 Codex

Text
Images
Audio
Video

License

Usage and distribution terms

Both models are proprietary and closed source.

Usage restrictions for both are defined by OpenAI.

GPT-5.4

Proprietary

Closed source

GPT-5.3 Codex

Proprietary

Closed source

Release Timeline

When each model was launched

GPT-5.4 was released on 2026-03-05, while GPT-5.3 Codex was released on 2026-02-05.

GPT-5.4 is 1 month newer than GPT-5.3 Codex.

GPT-5.4
Mar 5, 2026 (1 month ago, 4 weeks newer)

GPT-5.3 Codex
Feb 5, 2026 (2 months ago)
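
The release gap and the relative ages can be checked with date arithmetic. This is a small sketch using Python's standard `datetime` module; the snapshot date of Apr 18, 2026 is taken from the page's own timestamp, and the variable names are illustrative.

```python
from datetime import date

# Release dates and snapshot date as stated on the page.
gpt_5_4_release = date(2026, 3, 5)
gpt_5_3_codex_release = date(2026, 2, 5)
snapshot = date(2026, 4, 18)

# Gap between the two releases.
gap_days = (gpt_5_4_release - gpt_5_3_codex_release).days
gap_weeks = gap_days // 7

# Age of each model on the snapshot date.
age_gpt_5_4 = (snapshot - gpt_5_4_release).days
age_gpt_5_3_codex = (snapshot - gpt_5_3_codex_release).days

print(f"Release gap: {gap_days} days ({gap_weeks} weeks)")
print(f"GPT-5.4 age at snapshot: {age_gpt_5_4} days")
print(f"GPT-5.3 Codex age at snapshot: {age_gpt_5_3_codex} days")
```

The gap is exactly 28 days because February 2026 has 28 days, which is why the page reports both "1 month" and "4w".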

Knowledge Cutoff

When training data ends

Neither model specifies a knowledge cutoff date, so the recency of their training data cannot be compared.

Provider Availability

Both GPT-5.4 and GPT-5.3 Codex are available from OpenAI.

GPT-5.4
OpenAI: input $2.50/1M, output $15.00/1M

GPT-5.3 Codex
OpenAI: input $1.75/1M, output $14.00/1M

* Prices shown are per million tokens
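
To turn the per-million prices into a cost for a specific request, multiply each token count by its price and divide by one million. The sketch below does that for a made-up workload of 200,000 input tokens and 20,000 output tokens; the workload and the `request_cost` helper are illustrative assumptions, and real bills may differ (for example through caching or batch discounts, which the page does not cover).

```python
# Prices in USD per 1M tokens, copied from the provider listing above.
PRICES_PER_MILLION = {
    "GPT-5.4": {"input": 2.50, "output": 15.00},
    "GPT-5.3 Codex": {"input": 1.75, "output": 14.00},
}


def request_cost(model, input_tokens, output_tokens):
    """Estimated USD cost of one request at the listed per-million prices."""
    price = PRICES_PER_MILLION[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical workload: 200,000 input tokens, 20,000 output tokens.
for model in PRICES_PER_MILLION:
    cost = request_cost(model, 200_000, 20_000)
    print(f"{model}: ${cost:.2f}")
```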

Key Takeaways

GPT-5.4
Larger context window (1,000,000 vs 400,000 input tokens)
Higher OSWorld-Verified score (75.0% vs 64.7%)
Higher SWE-Bench Pro score (57.7% vs 56.8%)

GPT-5.3 Codex
Less expensive input tokens ($1.75 vs $2.50 per 1M)
Less expensive output tokens ($14.00 vs $15.00 per 1M)
Higher Terminal-Bench 2.0 score (77.3% vs 75.1%)
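
The benchmark scores listed above can be tallied to see both who wins each benchmark and by how much. This is a minimal sketch: the scores are the three shared benchmarks from this page, and the margin is a simple difference in percentage points, which is one of several reasonable ways to compare them.

```python
from collections import Counter

# Scores (percent) for the three benchmarks both models report on this page.
SCORES = {
    "OSWorld-Verified": {"GPT-5.4": 75.0, "GPT-5.3 Codex": 64.7},
    "SWE-Bench Pro": {"GPT-5.4": 57.7, "GPT-5.3 Codex": 56.8},
    "Terminal-Bench 2.0": {"GPT-5.4": 75.1, "GPT-5.3 Codex": 77.3},
}

wins = Counter()
margins = {}
for benchmark, scores in SCORES.items():
    winner = max(scores, key=scores.get)
    # Margin in percentage points between the higher and lower score.
    margin = round(scores[winner] - min(scores.values()), 1)
    wins[winner] += 1
    margins[benchmark] = (winner, margin)

for benchmark, (winner, margin) in margins.items():
    print(f"{benchmark}: {winner} leads by {margin} points")
print(dict(wins))
```

The margins are uneven: GPT-5.4's lead on OSWorld-Verified is large, while its lead on SWE-Bench Pro is under one point.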

Detailed Comparison

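AI Model Comparison Table

Feature              GPT-5.4 (OpenAI)              GPT-5.3 Codex (OpenAI)
Input price          $2.50/1M                      $1.75/1M
Output price         $15.00/1M                     $14.00/1M
Input context        1,000,000 tokens              400,000 tokens
Max output           128,000 tokens                128,000 tokens
Input modalities     Text, Images, Audio, Video    Text, Images, Audio, Video
License              Proprietary                   Proprietary
Release date         Mar 5, 2026                   Feb 5, 2026
Knowledge cutoff     Not specified                 Not specified
OSWorld-Verified     75.0%                         64.7%
SWE-Bench Pro        57.7%                         56.8%
Terminal-Bench 2.0   75.1%                         77.3%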

FAQ

Common questions about GPT-5.4 vs GPT-5.3 Codex

GPT-5.4 leads on 2 of the 3 benchmarks both models report. Both models are made by OpenAI. The best choice depends on your use case: compare their benchmark scores, pricing, and capabilities above.
GPT-5.4 scores 98.9% on Tau2 Telecom, 93.7% on ARC-AGI, 93.0% on Graphwalks BFS <128k, 92.8% on GPQA, and 89.8% on Graphwalks parents <128k. GPT-5.3 Codex scores 81.4% on SWE-Lancer (IC-Diamond subset), 77.6% on Cybersecurity CTFs, 77.3% on Terminal-Bench 2.0, 64.7% on OSWorld-Verified, and 56.8% on SWE-Bench Pro.
GPT-5.3 Codex is 1.4x cheaper for input tokens and 1.1x cheaper for output tokens. GPT-5.4 costs $2.50/M input and $15.00/M output via OpenAI. GPT-5.3 Codex costs $1.75/M input and $14.00/M output via OpenAI.
GPT-5.4 supports 1M input tokens and GPT-5.3 Codex supports 400K. A larger context window lets you process longer documents, conversations, or codebases in a single request.
Key differences include context window (1M vs 400K input tokens) and input pricing ($2.50 vs $1.75 per 1M tokens). See the full comparison above for benchmark-by-benchmark results.