StepFun · Released on Jan 15, 2026

Step3-VL-10B: Benchmarks, Pricing & Context Window

Step3-VL-10B is a language model from StepFun, released in January 2026, with multimodal input.

STEP3-VL-10B is a lightweight open-source foundation model designed to redefine the trade-off between compact efficiency and frontier-level multimodal intelligence. Built on a unified, fully unfrozen pre-training strategy on 1.2T

Step3-VL-10B API

API access coming soon

Step3-VL-10B will be available through our gateway shortly.

Step3-VL-10B license

Step3-VL-10B is released under the Apache 2.0 license, which permits commercial use. The model has 10.0B parameters.

License: Apache 2.0 (commercial use allowed)
Parameters: 10.0B

FAQ

Common questions about Step3-VL-10B.

What is the Step3-VL-10B release date?

Step3-VL-10B was released on January 15, 2026 by StepFun.

Who created Step3-VL-10B?

Step3-VL-10B was created by StepFun.

How many parameters does Step3-VL-10B have?

Step3-VL-10B has 10.0 billion parameters.
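
As a rough illustration of what a 10.0B parameter count means for hardware, the sketch below estimates the storage needed for the weights alone at a few common numeric precisions. The precisions and bytes-per-parameter values are assumptions chosen for illustration; this page does not state which format the released weights use, and the estimate ignores activations, KV cache, and runtime overhead.

```python
# Rough weight-memory estimate for a 10.0B-parameter model.
PARAMS = 10.0e9  # parameter count stated on this page

# Assumed storage widths in bytes per parameter (illustrative, not from the source).
BYTES_PER_PARAM = {"fp32": 4.0, "bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Return weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


# Map each assumed precision to its estimated weight footprint.
estimates = {
    name: weight_memory_gb(PARAMS, width)
    for name, width in BYTES_PER_PARAM.items()
}

for name, gb in estimates.items():
    print(f"{name}: {gb:.1f} GB")
```

Under these assumptions the weights would take about 20 GB at 16-bit precision and about 40 GB at 32-bit precision; actual memory use when running the model will be higher.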

What is the license for Step3-VL-10B?

Step3-VL-10B is released under the Apache 2.0 license, a permissive open-source license that allows commercial use.

Is Step3-VL-10B multimodal?

Yes, Step3-VL-10B is a multimodal model that can process both text and images as input.