QwenReleased on Sep 22, 2025

Qwen3 VL 4B Instruct: Benchmarks, Pricing & Context Window

Qwen3 VL 4B Instruct is a language model from Qwen, released in September 2025, with multimodal input.

Qwen3-VL is a large multimodal model that unifies vision, language, and reasoning to achieve human-level perception and cognition across text, images, and video. Built on a 235B-parameter architecture, it integrates early joint training of

Input
TextImage
Output
Text

Qwen3 VL 4B Instruct pricing

Providers

Qwen3 VL 4B Instruct starts at $0.100 per million input tokens and $0.600 per million output tokens via DeepInfra.

ProviderInput $/MOutput $/MMax InputMax OutputLatency sThroughputQuantInputOutput
DeepInfra logoDeepInfra
$0.100$0.600262.1K262.1K
0.51
154 c/s
fp8

Qwen3 VL 4B Instruct API

API access coming soon

Qwen3 VL 4B Instruct will be available through our gateway shortly.

Qwen3 VL 4B Instruct examples

Recent arena outputs from Qwen3 VL 4B Instruct, picked from the highest-ranked matchups.

Qwen3 VL 4B Instruct license

Qwen3 VL 4B Instruct is released under the Apache 2.0 license, which permits commercial use, has 4.0B parameters.

License
Apache 2.0
Commercial use allowed
Parameters
4.0B

Apache License 2.0 - allows commercial use

FAQ

Common questions about Qwen3 VL 4B Instruct.

What is the Qwen3 VL 4B Instruct release date?

Qwen3 VL 4B Instruct was released on September 22, 2025 by Qwen.

Who created Qwen3 VL 4B Instruct?

Qwen3 VL 4B Instruct was created by Qwen.

How many parameters does Qwen3 VL 4B Instruct have?

Qwen3 VL 4B Instruct has 4.0 billion parameters.

What is the license for Qwen3 VL 4B Instruct?

Qwen3 VL 4B Instruct is released under the Apache 2.0 license. This is an open-source/open-weight license.

Is Qwen3 VL 4B Instruct multimodal?

Yes, Qwen3 VL 4B Instruct is a multimodal model that can process both text and images as input.