Qwen3 VL 4B Instruct: Benchmarks, Pricing & Context Window
Qwen3 VL 4B Instruct is a vision-language model from Qwen, released in September 2025, that accepts multimodal input.
Qwen3-VL is a large multimodal model family that unifies vision, language, and reasoning, aiming for human-level perception and cognition across text, images, and video. The family's largest model is built on a 235B-parameter architecture (the 4B Instruct variant listed here has 4.0B parameters) and integrates early joint training of …
Qwen3 VL 4B Instruct pricing
Providers
Qwen3 VL 4B Instruct starts at $0.100 per million input tokens and $0.600 per million output tokens via DeepInfra.
| Provider | Input $/M | Output $/M | Max Input | Max Output | Latency (s) | Throughput | Quant |
|---|---|---|---|---|---|---|---|
| DeepInfra | $0.100 | $0.600 | 262.1K | 262.1K | 0.51 | 154 c/s | fp8 |
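To make the per-million-token prices concrete, here is a minimal sketch of a cost estimate. It assumes billing is linear in token count at the listed DeepInfra rates; `estimate_cost_usd` is a hypothetical helper, not part of any provider SDK, and how image or video input is converted to tokens is up to the provider and not modeled here.

```python
# Listed rates from the provider table (USD per million tokens).
INPUT_USD_PER_M = 0.100
OUTPUT_USD_PER_M = 0.600


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in USD (hypothetical helper).

    Assumes simple linear per-token billing at the listed rates.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Example: a 200K-token prompt with a 10K-token reply.
    print(f"${estimate_cost_usd(200_000, 10_000):.4f}")
```

Because output tokens cost six times as much as input tokens at these rates, long generations dominate the bill even when the prompt is much larger than the reply.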
Qwen3 VL 4B Instruct API
API access coming soon
Qwen3 VL 4B Instruct will be available through our gateway shortly.
Qwen3 VL 4B Instruct examples
Recent arena outputs from Qwen3 VL 4B Instruct, picked from the highest-ranked matchups.
Qwen3 VL 4B Instruct license
Qwen3 VL 4B Instruct is released under the Apache 2.0 license, which permits commercial use. The model has 4.0B parameters.
- License: Apache 2.0 (commercial use allowed)
- Parameters: 4.0B
FAQ
Common questions about Qwen3 VL 4B Instruct.