At a glance

DeepSeekpricing, performance & catalog

The citable facts about DeepSeek's 4 models — sourced from provider APIs and refreshed continuously.

Lowest price
DeepSeek-V4-Flash-Max at $0.140 per 1M input tokens
Highest throughput
DeepSeek-V3.2 (Non-thinking) at 100 tokens/s
Lowest latency
DeepSeek-V3.2 (Non-thinking) at 0.30s
Largest context
DeepSeek-V4-Flash-Max at 1.0M tokens
Catalog
4 active models from 1 organization

FAQ

Common questions about DeepSeek.

What is DeepSeek?

DeepSeek is an API provider that hosts large language models. Active models: 4; From (input): $0.14 / 1M tok; Avg throughput: 55 tok/s; Avg latency: 0.30 s; Max context: 1.0M.

How many models does DeepSeek offer?

DeepSeek currently serves 4 active models out of 9 historical offerings on LLM Stats.

What is DeepSeek's API pricing?

DeepSeek input pricing starts from $0.14 per 1M tokens, with the most expensive offering at $1.74 per 1M tokens. See the Pricing tab above for the full per-model breakdown.

How fast is DeepSeek?

DeepSeek averages 55 output tokens per second across its catalog, with average latency of 0.30s. Per-model performance is shown in the Performance tab.

Is DeepSeek OpenAI compatible?

Most providers expose an OpenAI-compatible /v1/chat/completions endpoint so you can switch from OpenAI to DeepSeek by changing only the base URL and API key. Check https://deepseek.com/ for the exact endpoint format and any provider-specific parameters.

Whose models does DeepSeek host?

DeepSeek hosts models from DeepSeek. See the Models tab for the full catalog grouped by creator.

How do I start using DeepSeek?

Sign up at https://deepseek.com/ to get an API key, then call DeepSeek's API directly from your application. Most clients work out of the box by pointing the OpenAI SDK at DeepSeek's base URL with your key. Use the Pricing and Performance tabs above to pick the right model for your latency, cost, and context-window requirements.