When was GPT-4o released?

GPT-4o was released on May 13, 2024 by OpenAI. This is the official GPT-4o release date tracked on LLM Stats.

How much does GPT-4o cost?

GPT-4o costs $2.50 per million input tokens and $10.00 per million output tokens through the LLM Stats API, which works with any OpenAI-compatible SDK. Across tracked providers, the lowest price is $2.50 per million input tokens via Azure.
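As a rough illustration of how per-token pricing turns into a per-request cost, the sketch below applies the rates quoted above to one request. The helper name and the token counts are hypothetical, and an actual bill depends on how the provider meters tokens.

```python
from decimal import Decimal

# Rates quoted on this page, in US dollars per one million tokens.
INPUT_USD_PER_M = Decimal("2.50")
OUTPUT_USD_PER_M = Decimal("10.00")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost of one request (hypothetical helper, not an LLM Stats API)."""
    input_cost = Decimal(input_tokens) / TOKENS_PER_M * INPUT_USD_PER_M
    output_cost = Decimal(output_tokens) / TOKENS_PER_M * OUTPUT_USD_PER_M
    return input_cost + output_cost


# Example: a 12,000-token prompt that yields an 800-token answer.
cost = estimate_cost_usd(12_000, 800)
print(f"${cost:.4f}")
```

Decimal is used instead of float so that small per-request amounts add up without rounding drift when summed over many requests.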

Is GPT-4o available via API?

Yes. GPT-4o is available through the LLM Stats API and works with any OpenAI-compatible SDK: point your client at the gateway base URL and pass the model name. Two providers tracked on LLM Stats serve it.

What is the license for GPT-4o?

GPT-4o is released under a proprietary license.

Is GPT-4o multimodal?

Yes, GPT-4o is multimodal and can accept both text and images as input.
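In the OpenAI Chat Completions format, an image is sent as a content part next to the text part of a user message. The sketch below builds such a message with an inline base64 data URL. The helper name and the placeholder bytes are hypothetical, and it is an assumption that the gateway forwards image parts to the provider unchanged.

```python
import base64
import json


def build_image_message(prompt: str, image_bytes: bytes, mime_type: str = "image/png") -> dict:
    """Build one user message containing text plus an inline image (hypothetical helper)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    data_url = f"data:{mime_type};base64,{encoded}"
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": data_url}},
        ],
    }


# Placeholder bytes stand in for a real image file read from disk.
message = build_image_message("Describe this image.", b"\x89PNG placeholder")
print(json.dumps(message, indent=2))
```

The resulting dictionary can be passed as one element of the messages list in a chat completions request.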

What is GPT-4o latency?

GPT-4o p95 time to first token is 1.42 seconds via OpenAI over the trailing 7 days. Lower time to first token means the model begins responding sooner for chat, agents and API workloads.

Where can I use GPT-4o?

GPT-4o is available through 2 providers: Azure and OpenAI.

Where is the GPT-4o paper or technical report?

GPT-4o has a paper or technical report available at https://openai.com/index/hello-gpt-4o/. Use that source for architecture, training, release and evaluation details.

What models should I compare GPT-4o against?

Common GPT-4o comparisons include GPT-4o vs Llama 3.3 70B Instruct, GPT-4o vs Llama 3.1 405B Instruct, and GPT-4o vs Grok-2. Compare them side by side for benchmark scores, pricing, context window, latency and API availability.

GPT-4o API Pricing, Context Window & Benchmarks

Name: GPT-4o
Author: OpenAI

GPT-4o is a language model from OpenAI, released in May 2024, with multimodal input, a 128K-token context window, and pricing from $2.50/M input and $10.00/M output.

GPT-4o ('o' for 'omni') is a multimodal AI model that accepts text, audio, image, and video inputs, and generates text, audio, and image outputs. It matches GPT-4 Turbo performance on text and code, with improvements in non-English

Input: Text, Image
Output: Text

GPT-4o benchmarks

GPT-4o Performance Across Datasets

Scores sourced from the model's scorecard, paper, or official blog posts

llm-stats.com - Mon Aug 03 2026

GPT-4o pricing

Providers

GPT-4o starts at $2.50 per million input tokens and $10.00 per million output tokens via Azure. See all 2 providers below with their per-token pricing, latency, throughput, and modality support.

Provider	Input $/M	Output $/M	Context in / out	TTFT p50 / p95 s	Output avg / p5 c/s	Success 7d	Modalities in / out
Azure	$2.50	$10.00	128.0K/4.1K	—/0.54	92/—	—	/
OpenAI	$2.50	$10.00	128.0K/4.1K	0.47/1.42	714/433	100.00%(12)	/

Cached input is the discounted price for prompt tokens served from a provider cache. TTFT is time to first token. Output is characters per second; p5 is the sustained floor exceeded by 95% of observed requests. Success is calculated from completed versus failed requests over the trailing seven days.
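To make the metric definitions above concrete, the sketch below derives p50 and p95 time to first token, a p5 throughput floor, and a success rate from a handful of request samples. The sample values are made up for illustration, and the nearest-rank percentile method is an assumption, since the page does not state how LLM Stats computes its percentiles.

```python
import math


def nearest_rank_percentile(values: list[float], pct: float) -> float:
    """Return the nearest-rank percentile (0 < pct <= 100) of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


# Made-up samples for illustration only; these are not LLM Stats measurements.
ttft_seconds = [0.41, 0.44, 0.47, 0.52, 0.60, 0.75, 0.90, 1.10, 1.30, 1.42]
output_chars_per_second = [433, 520, 610, 680, 714, 730, 760, 790, 820, 850]
completed, failed = 12, 0

ttft_p50 = nearest_rank_percentile(ttft_seconds, 50)
ttft_p95 = nearest_rank_percentile(ttft_seconds, 95)
throughput_p5 = nearest_rank_percentile(output_chars_per_second, 5)
success_rate = completed / (completed + failed) * 100

print(f"TTFT p50/p95: {ttft_p50:.2f}/{ttft_p95:.2f} s")
print(f"Output p5 floor: {throughput_p5} c/s")
print(f"Success: {success_rate:.2f}%")
```

Note the direction of each tail: for latency the high percentile (p95) is the pessimistic figure, while for throughput the low percentile (p5) is.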

GPT-4o context window

Input and output token limits for GPT-4o, plus how it ranks on long-context understanding.

Input: 128K tokens (≈ 192 pages of text)
Output: 4K tokens
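The sketch below shows one way to check a request against these limits and to reproduce the page estimate. The exact integer limits are assumptions based on the rounded values shown on this page, and the conversion factors (0.75 words per token, 500 words per page) are common rules of thumb that happen to yield 192 pages; the page does not say how its own estimate is derived.

```python
# Limits as listed on this page; the exact integers are assumptions
# (the page shows rounded values: 128K input, 4K / 4.1K output).
MAX_INPUT_TOKENS = 128_000
MAX_OUTPUT_TOKENS = 4_096

# Rules of thumb, not values published by LLM Stats.
WORDS_PER_TOKEN = 0.75
WORDS_PER_PAGE = 500


def tokens_to_pages(tokens: int) -> float:
    """Convert a token count to an approximate page count."""
    return tokens * WORDS_PER_TOKEN / WORDS_PER_PAGE


def fits_limits(prompt_tokens: int, requested_output_tokens: int) -> bool:
    """Check a request against the input and output limits separately."""
    return (
        prompt_tokens <= MAX_INPUT_TOKENS
        and requested_output_tokens <= MAX_OUTPUT_TOKENS
    )


print(round(tokens_to_pages(MAX_INPUT_TOKENS)))
print(fits_limits(prompt_tokens=90_000, requested_output_tokens=2_000))
print(fits_limits(prompt_tokens=90_000, requested_output_tokens=8_000))
```

Treating input and output as separate limits follows the "Context in / out" column in the provider table; a provider may instead count both against one shared window, in which case the check would need to sum them.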

GPT-4o API

POST /v1/chat/completions

Model: gpt-4o-2024-05-13

Use it in your code

Billed at $2.50 input / $10.00 output per 1M tokens through the LLM Stats gateway.

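from openai import OpenAI

# Point the OpenAI SDK at the LLM Stats gateway instead of the default endpoint.
client = OpenAI(
    api_key="YOUR_API_KEY",  # replace with your own API key
    base_url="https://gateway.llm-stats.com/v1",
)

# Send a single-turn chat request to the dated GPT-4o snapshot.
response = client.chat.completions.create(
    model="gpt-4o-2024-05-13",
    messages=[
        {"role": "user", "content": "What is machine learning?"},
    ],
)

# Print the text of the first returned choice.
print(response.choices[0].message.content)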
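Because the endpoint is OpenAI-compatible, the same request can in principle be issued without the SDK. The sketch below builds the POST to /v1/chat/completions with the Python standard library only. The helper names are hypothetical, and the bearer-token Authorization header follows the OpenAI convention; that the gateway accepts it in this form is an assumption.

```python
import json
import urllib.request

BASE_URL = "https://gateway.llm-stats.com/v1"  # gateway base URL from the example above


def build_chat_request(api_key: str, prompt: str, model: str = "gpt-4o-2024-05-13") -> urllib.request.Request:
    """Build (but do not send) a chat completions request. Hypothetical helper."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed: OpenAI-style bearer auth
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> str:
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(request, timeout=60) as response:
        payload = json.load(response)
    return payload["choices"][0]["message"]["content"]


request = build_chat_request("YOUR_API_KEY", "What is machine learning?")
print(request.get_method(), request.full_url)
# To perform the call, replace YOUR_API_KEY and run: print(send(request))
```

Separating request construction from sending keeps the payload easy to inspect or log before any tokens are billed.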

GPT-4o latency

GPT-4o time to first token, sustained output throughput, and failed-request rate from live API traffic over the trailing 7 days.

Provider operational metrics

Time to first token, output throughput, and failed-request rate from live API traffic

GPT-4o examples

Recent arena outputs from GPT-4o, picked from the highest-ranked matchups.

GPT-4o license

GPT-4o is a proprietary model available under its provider's product and API terms.

License: Proprietary; Hosted access

Proprietary license - usage restrictions apply

GPT-4o resources

Official sources for GPT-4o: API documentation, official playground, and official launch post.

GPT-4o vs other models

The most-compared alternatives to GPT-4o are Llama 3.3 70B Instruct, Llama 3.1 405B Instruct, and Grok-2. Open any pair side-by-side for benchmarks, pricing, context, and latency.