API Provider4 active models2 organizationsazure.microsoft.com

Azure: API pricing, performance & models

Azure hosts 4 active AI models, with input pricing from $0.50 per 1M tokens, averaging 95 tok/s output throughput, with up to 128K context window. Compare Azure's API pricing, latency, and feature support against other LLM providers.

4Active
Pricing
$0.500/MFrom
$3.42/MAvg
Performance
95tok/sThroughput
0.59sLatency
128KMax

Catalog

Type
Price
6 models
Model
GPT-4 Turbo
GPT-4o
GPT-4o
GPT-4o
GPT-4o
GPT-3.5 Turbo

FAQ

Common questions about Azure.

What is Azure?

Azure is an API provider that hosts large language models. Active models: 4; From (input): $0.50 / 1M tok; Avg throughput: 95 tok/s; Avg latency: 0.59 s; Max context: 128K.

How many models does Azure offer?

Azure currently serves 4 active models out of 11 historical offerings on LLM Stats.

What is Azure's API pricing?

Azure input pricing starts from $0.50 per 1M tokens, with the most expensive offering at $10 per 1M tokens. See the Pricing tab above for the full per-model breakdown.

How fast is Azure?

Azure averages 95 output tokens per second across its catalog, with average latency of 0.59s. Per-model performance is shown in the Performance tab.

Does Azure support multimodal models?

Yes. Azure's catalog includes 2 vision-capable models. See the Models and Capabilities tabs for the full per-model breakdown.

Whose models does Azure host?

Azure hosts models from OpenAI and Microsoft. See the Models tab for the full catalog grouped by creator.

How do I start using Azure?

Sign up at https://azure.microsoft.com to get an API key, then call Azure's API directly from your application. Use the Pricing and Performance tabs above to pick the right model for your latency, cost, and context-window requirements.