Question 1

What is Azure?

Accepted Answer

Azure is an API provider that hosts large language models. Active models: 4; From (input): $0.50 / 1M tok; Avg throughput: 95 tok/s; Avg latency: 0.59 s; Max context: 128K.

Question 2

How many models does Azure offer?

Accepted Answer

Azure currently serves 4 active models out of 11 historical offerings on LLM Stats.

Question 3

What is Azure's API pricing?

Accepted Answer

Azure input pricing starts from $0.50 per 1M tokens, with the most expensive offering at $10 per 1M tokens. See the Pricing tab above for the full per-model breakdown.

Question 4

How fast is Azure?

Accepted Answer

Azure averages 95 output tokens per second across its catalog, with average latency of 0.59s. Per-model performance is shown in the Performance tab.

Question 5

Is Azure OpenAI compatible?

Accepted Answer

Most providers expose an OpenAI-compatible /v1/chat/completions endpoint so you can switch from OpenAI to Azure by changing only the base URL and API key. Check https://azure.microsoft.com for the exact endpoint format and any provider-specific parameters.

Question 6

Does Azure support multimodal models?

Accepted Answer

Yes. Azure's catalog includes 2 vision-capable models. See the Models and Capabilities tabs for the full per-model breakdown.

Question 7

Whose models does Azure host?

Accepted Answer

Azure hosts models from OpenAI and Microsoft. See the Models tab for the full catalog grouped by creator.

Question 8

How do I start using Azure?

Accepted Answer

Sign up at https://azure.microsoft.com to get an API key, then call Azure's API directly from your application. Most clients work out of the box by pointing the OpenAI SDK at Azure's base URL with your key. Use the Pricing and Performance tabs above to pick the right model for your latency, cost, and context-window requirements.

Model	Input /M	Output /M	Throughput	Context	Capabilities
GPT-4 Turbo	$10.0	$30.0	97t/s	128K	—
GPT-4o	$2.50	$10.0	99t/s	128K	Vision
GPT-4o	$2.50	$10.0	99t/s	128K	—
GPT-4o	$2.50	$10.0	92t/s	128K	Vision
GPT-4o	$2.50	$10.0	92t/s	128K	—
GPT-3.5 Turbo	$0.500	$1.50	90t/s	16K	—

Model	Input /M	Output /M	Throughput	Context	Capabilities
GPT-4 Turbo	$10.0	$30.0	97t/s	128K	—
GPT-4o	$2.50	$10.0	99t/s	128K	Vision
GPT-4o	$2.50	$10.0	99t/s	128K	—
GPT-4o	$2.50	$10.0	92t/s	128K	Vision
GPT-4o	$2.50	$10.0	92t/s	128K	—
GPT-3.5 Turbo	$0.500	$1.50	90t/s	16K	—

Azure: API pricing, performance & models

Catalog

Azurepricing, performance & catalog

Most affordable

Fastest

Largest context

FAQ