FAQ

Common questions about Google.

What is Google?

Google is an API provider that hosts large language models. It currently offers 17 active models, with input pricing from $0.25 per 1M tokens, average throughput of 78 tokens/s, average latency of 0.60 s, and a maximum context window of 1.0M tokens.

How many models does Google offer?

Google currently serves 17 active models out of 40 historical offerings on LLM Stats.

What is Google's API pricing?

Google input pricing starts from $0.25 per 1M tokens, with the most expensive offering at $15 per 1M tokens. See the Pricing tab above for the full per-model breakdown.
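As a quick sanity check on what that range means per request, here is a minimal sketch of the arithmetic (the 10,000-token prompt size is an arbitrary example, not a figure from this page):

```python
# Rough per-request input cost, using the catalog-wide price range
# quoted above ($0.25 to $15 per 1M input tokens).
def input_cost_usd(tokens: int, price_per_million: float) -> float:
    """Cost in USD of `tokens` input tokens at `price_per_million` USD per 1M tokens."""
    return tokens * price_per_million / 1_000_000

# A hypothetical 10,000-token prompt at the cheapest quoted rate:
cheapest = input_cost_usd(10_000, 0.25)   # 0.0025 USD
# The same prompt at the most expensive quoted rate:
priciest = input_cost_usd(10_000, 15.0)   # 0.15 USD
```

Output pricing is typically higher than input pricing, so check the Pricing tab for the model you actually plan to use.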

How fast is Google?

Google averages 78 output tokens per second across its catalog, with average latency of 0.60s. Per-model performance is shown in the Performance tab.
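Those two averages combine into a back-of-envelope response-time estimate: time to first token plus generation time. A minimal sketch, assuming the catalog-wide averages above (individual models will differ):

```python
# Estimated wall-clock time for a response of `output_tokens` tokens:
# latency to first token + tokens divided by throughput.
def est_response_seconds(output_tokens: int,
                         latency_s: float = 0.60,
                         throughput_tok_s: float = 78.0) -> float:
    return latency_s + output_tokens / throughput_tok_s

# e.g. a 500-token answer at catalog-average speed:
t = est_response_seconds(500)   # ~7.0 s
```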

Does Google support multimodal models?

Yes. Google's catalog includes 20 vision-capable models, 13 image-generation models, and 10 video models. See the Models and Capabilities tabs for the full per-model breakdown.

Whose models does Google host?

Google hosts models from Anthropic, Google, AI21 Labs, Meta, and Mistral AI. See the Models tab for the full catalog grouped by creator.

How do I start using Google?

Sign up at https://ai.google.dev to get an API key, then call Google's API directly from your application. Use the Pricing and Performance tabs above to pick the right model for your latency, cost, and context-window requirements.
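As a concrete starting point, here is a minimal sketch of a REST call to the Gemini API using only the Python standard library. The model name `gemini-1.5-flash` is an example, not a recommendation from this page; substitute any model from the Models tab, and assume the key is supplied via a `GEMINI_API_KEY` environment variable:

```python
import json
import os
import urllib.request

# Example model; pick one from the Models tab for real use.
MODEL = "gemini-1.5-flash"
URL = f"https://generativelanguage.googleapis.com/v1beta/models/{MODEL}:generateContent"

# A minimal generateContent request body: one user turn with one text part.
payload = {"contents": [{"parts": [{"text": "Say hello in one sentence."}]}]}

def build_request(api_key: str) -> urllib.request.Request:
    """Build the POST request; the key is passed as a query parameter."""
    return urllib.request.Request(
        f"{URL}?key={api_key}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

api_key = os.environ.get("GEMINI_API_KEY")
if api_key:  # only hit the network when a key is actually configured
    with urllib.request.urlopen(build_request(api_key)) as resp:
        body = json.load(resp)
        # Print the first candidate's text from the response.
        print(body["candidates"][0]["content"]["parts"][0]["text"])
```

Google also publishes official client SDKs; the raw REST shape above is mainly useful for understanding what any SDK sends under the hood.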