
Vertex AI: API pricing, performance & models

Vertex AI hosts 1 active AI model, with input pricing from $5.00 per 1M tokens, an average output throughput of 42 tok/s, and a context window of up to 1.0M tokens. Compare Vertex AI's API pricing, latency, and feature support against other LLM providers.

First-party only

Active models: 1
Pricing: from $5.00/M (input), avg $5.00/M
Performance: 42 tok/s throughput, 0.50 s latency
Max context: 1.0M

Catalog

2 models (columns: Model, Type, Price)

Claude Opus 4.7
Claude Opus 4.7

FAQ

Common questions about Vertex AI.

What is Vertex AI?

Vertex AI is an API provider that hosts large language models. It currently lists 1 active model, with input pricing from $5.00 per 1M tokens, average throughput of 42 tok/s, average latency of 0.50 s, and a maximum context window of 1.0M tokens.

How many models does Vertex AI offer?

Vertex AI currently serves 1 active model out of 1 historical offering on LLM Stats.

What is Vertex AI's API pricing?

Vertex AI input pricing starts at $5.00 per 1M tokens; its most expensive offering is also $5.00 per 1M tokens. See the Pricing tab above for the full per-model breakdown.
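As a quick sanity check on the pricing above, a per-request cost estimate at the listed $5.00 per 1M input tokens can be sketched as follows. This covers input tokens only, since this page does not list a per-output-token price:

```python
def input_cost_usd(input_tokens: int, price_per_million: float = 5.00) -> float:
    """Estimate the input-token cost of one request at a flat per-1M-token rate."""
    return input_tokens / 1_000_000 * price_per_million

# A 100k-token prompt at $5.00 per 1M input tokens costs $0.50.
print(input_cost_usd(100_000))  # 0.5
```

Note this is a back-of-the-envelope figure; actual billing may round token counts or price cached and non-cached tokens differently.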

How fast is Vertex AI?

Vertex AI averages 42 output tokens per second across its catalog, with average latency of 0.50s. Per-model performance is shown in the Performance tab.
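The throughput and latency figures above can be combined into a rough wall-clock estimate for a response. This is a simple linear model (time to first token plus steady-state generation), so real requests will vary with load, prompt size, and model:

```python
def estimated_response_seconds(output_tokens: int,
                               throughput_tps: float = 42.0,
                               first_token_latency_s: float = 0.50) -> float:
    """Rough wall-clock estimate: first-token latency plus tokens / throughput."""
    return first_token_latency_s + output_tokens / throughput_tps

# A 1,000-token response at 42 tok/s with 0.50 s latency takes roughly 24 s.
print(round(estimated_response_seconds(1_000), 1))
```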

Does Vertex AI support multimodal models?

Yes. Vertex AI's catalog includes 1 vision-capable model. See the Models and Capabilities tabs for the full per-model breakdown.

Whose models does Vertex AI host?

Vertex AI hosts models from Anthropic. See the Models tab for the full catalog grouped by creator.

How do I start using Vertex AI?

Sign up at https://cloud.google.com/vertex-ai to get an API key, then call Vertex AI's API directly from your application. Use the Pricing and Performance tabs above to pick the right model for your latency, cost, and context-window requirements.
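Since Vertex AI hosts Anthropic models, calls go through Vertex's publisher endpoint rather than Anthropic's own API. A minimal sketch of building such a request is below; the endpoint shape and `vertex-2023-10-16` version string follow Google's Anthropic-on-Vertex documentation at the time of writing, and the project, region, and model ID are placeholders you must replace with your own (no network call is made here):

```python
def build_request(project: str, region: str, model: str, prompt: str) -> tuple[str, dict]:
    """Build the URL and JSON body for an Anthropic model on Vertex AI (sketch)."""
    url = (f"https://{region}-aiplatform.googleapis.com/v1/projects/{project}"
           f"/locations/{region}/publishers/anthropic/models/{model}:rawPredict")
    payload = {
        "anthropic_version": "vertex-2023-10-16",  # Vertex-specific version string
        "max_tokens": 256,
        "messages": [{"role": "user", "content": prompt}],
    }
    return url, payload

url, payload = build_request("my-project", "us-east5", "claude-model-id", "Hello")
# Send with e.g. requests.post(url, json=payload,
#                              headers={"Authorization": f"Bearer {access_token}"})
# where access_token comes from your Google Cloud credentials.
```

Check the current Vertex AI documentation for your model's exact ID and supported regions before relying on this shape.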