API Provider22 active models10 organizationsnovita.ai

Novita: API pricing, performance & models

Novita hosts 22 active AI models, with input pricing from $0.08 per 1M tokens, averaging 45 tok/s output throughput, with up to 1.0M context window. Compare Novita's API pricing, latency, and feature support against other LLM providers.

22Active

Pricing

$0.080/MFrom

$0.471/MAvg

Performance

45tok/sThroughput

0.95sLatency

1.0MMax

Catalog

Alibaba Cloud / Qwen Team23Moonshot AI6Google4Xiaomi4DeepSeek2MiniMax2OpenAI1Zhipu AI1

Type

Price

43 models

Model	Input /M	Output /M	Throughput	Context	Capabilities
Qwen3.7 Max	$1.25	$3.75	—	1.0M	—
Qwen3.6-27Bbfloat16	$0.600	$3.60	—	262K	Vision
Qwen3.6-27Bbfloat16	$0.600	$3.60	—	262K	—
Qwen3.5-27Bbfloat16	$0.300	$2.40	—	262K	—
Qwen3.5-27Bbfloat16	$0.300	$2.40	—	262K	Vision
Qwen3.5-27Bbfloat16	$0.300	$2.40	—	262K	—
Qwen3.5-35B-A3B	$0.250	$2.00	—	262K	—
Qwen3.5-35B-A3B	$0.250	$2.00	—	262K	Vision
Qwen3.5-35B-A3B	$0.250	$2.00	—	262K	—
Qwen3.5-122B-A10B	$0.400	$3.20	—	262K	—
Qwen3.5-122B-A10B	$0.400	$3.20	—	262K	Vision
Qwen3.5-122B-A10B	$0.400	$3.20	—	262K	—
Qwen3.5-397B-A17B	$0.600	$3.60	—	262K	—
Qwen3.5-397B-A17B	$0.600	$3.60	—	262K	Vision
Qwen3.5-397B-A17B	$0.600	$3.60	—	262K	—
Qwen3 VL 8B Instructfp8	$0.080	$0.500	—	131K	—
Qwen3 VL 8B Instructfp8	$0.080	$0.500	—	131K	Vision
Qwen3 VL 8B Instructfp8	$0.080	$0.500	—	131K	—
Qwen3 VL 235B A22B Instructbf16	$0.300	$1.50	—	131K	—
Qwen3 VL 235B A22B Instructbf16	$0.300	$1.50	—	131K	Vision
Qwen3 VL 235B A22B Instructbf16	$0.300	$1.50	—	131K	—
Qwen3 32B	$0.100	$0.440	32t/s	128K	—
Qwen3 30B A3B	$0.100	$0.440	89t/s	128K	—

At a glance

Novitapricing, performance & catalog

The citable facts about Novita's 22 models — sourced from provider APIs and refreshed continuously.

Lowest price: Qwen3 VL 8B Instruct at $0.080 per 1M input tokens
Highest throughput: Qwen3 30B A3B at 89 tokens/s
Lowest latency: Qwen3 30B A3B at 0.73s
Largest context: MiMo-V2.5-Pro at 1.0M tokens
Catalog: 22 active models from 10 organizations

Most affordable

Fastest

Largest context

FAQ

Common questions about Novita.

What is Novita?

Novita is an API provider that hosts large language models. Active models: 22; From (input): $0.08 / 1M tok; Avg throughput: 45 tok/s; Avg latency: 0.95 s; Max context: 1.0M.

How many models does Novita offer?

Novita currently serves 22 active models out of 46 historical offerings on LLM Stats.

What is Novita's API pricing?

Novita input pricing starts from $0.08 per 1M tokens, with the most expensive offering at $2 per 1M tokens. See the Pricing tab above for the full per-model breakdown.

How fast is Novita?

Novita averages 45 output tokens per second across its catalog, with average latency of 0.95s. Per-model performance is shown in the Performance tab.

Is Novita OpenAI compatible?

Most providers expose an OpenAI-compatible /v1/chat/completions endpoint so you can switch from OpenAI to Novita by changing only the base URL and API key. Check https://novita.ai/ for the exact endpoint format and any provider-specific parameters.

Does Novita support multimodal models?

Yes. Novita's catalog includes 12 vision-capable models. See the Models and Capabilities tabs for the full per-model breakdown.

Whose models does Novita host?

Novita hosts models from DeepSeek, Google, MiniMax, Moonshot AI, OpenAI, and Alibaba Cloud / Qwen Team, plus 4 more. See the Models tab for the full catalog grouped by creator.

How do I start using Novita?

Sign up at https://novita.ai/ to get an API key, then call Novita's API directly from your application. Most clients work out of the box by pointing the OpenAI SDK at Novita's base URL with your key. Use the Pricing and Performance tabs above to pick the right model for your latency, cost, and context-window requirements.

Catalog

Alibaba Cloud / Qwen Team23Moonshot AI6Google4Xiaomi4DeepSeek2MiniMax2OpenAI1Zhipu AI1

Type

Price

43 models

Model	Input /M	Output /M	Throughput	Context	Capabilities
Qwen3.7 Max	$1.25	$3.75	—	1.0M	—
Qwen3.6-27Bbfloat16	$0.600	$3.60	—	262K	Vision
Qwen3.6-27Bbfloat16	$0.600	$3.60	—	262K	—
Qwen3.5-27Bbfloat16	$0.300	$2.40	—	262K	—
Qwen3.5-27Bbfloat16	$0.300	$2.40	—	262K	Vision
Qwen3.5-27Bbfloat16	$0.300	$2.40	—	262K	—
Qwen3.5-35B-A3B	$0.250	$2.00	—	262K	—
Qwen3.5-35B-A3B	$0.250	$2.00	—	262K	Vision
Qwen3.5-35B-A3B	$0.250	$2.00	—	262K	—
Qwen3.5-122B-A10B	$0.400	$3.20	—	262K	—
Qwen3.5-122B-A10B	$0.400	$3.20	—	262K	Vision
Qwen3.5-122B-A10B	$0.400	$3.20	—	262K	—
Qwen3.5-397B-A17B	$0.600	$3.60	—	262K	—
Qwen3.5-397B-A17B	$0.600	$3.60	—	262K	Vision
Qwen3.5-397B-A17B	$0.600	$3.60	—	262K	—
Qwen3 VL 8B Instructfp8	$0.080	$0.500	—	131K	—
Qwen3 VL 8B Instructfp8	$0.080	$0.500	—	131K	Vision
Qwen3 VL 8B Instructfp8	$0.080	$0.500	—	131K	—
Qwen3 VL 235B A22B Instructbf16	$0.300	$1.50	—	131K	—
Qwen3 VL 235B A22B Instructbf16	$0.300	$1.50	—	131K	Vision
Qwen3 VL 235B A22B Instructbf16	$0.300	$1.50	—	131K	—
Qwen3 32B	$0.100	$0.440	32t/s	128K	—
Qwen3 30B A3B	$0.100	$0.440	89t/s	128K	—