Qwen2.5 VL 7B Instruct
Overview
Qwen2.5-VL is a vision-language model from the Qwen family. Key enhancements include visual understanding (objects, text, charts, layouts), visual agent capabilities (tool use, computer/phone control), long video comprehension with event pinpointing, visual localization (bounding boxes/points), and structured output generation.
Qwen2.5 VL 7B Instruct was released on January 26, 2025.
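To illustrate the image + text prompting described above, here is a minimal inference sketch. It assumes the Hugging Face transformers integration for Qwen2.5-VL (Qwen2_5_VLForConditionalGeneration) plus the qwen-vl-utils helper package; the image path and prompt are placeholders, not values from this page:

```python
# Minimal sketch: single-turn image + text inference with Qwen2.5-VL-7B-Instruct.
# Assumes a recent transformers release and the qwen-vl-utils package are installed.
from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration
from qwen_vl_utils import process_vision_info

model_id = "Qwen/Qwen2.5-VL-7B-Instruct"
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)

# One user turn mixing an image and a text instruction (placeholder image path).
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": "path/to/example.jpg"},  # placeholder
            {"type": "text", "text": "Describe this image."},
        ],
    }
]

# Render the chat template, gather vision inputs, and generate a reply.
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
image_inputs, video_inputs = process_vision_info(messages)
inputs = processor(
    text=[text], images=image_inputs, videos=video_inputs,
    padding=True, return_tensors="pt",
).to(model.device)

output_ids = model.generate(**inputs, max_new_tokens=128)
trimmed = [out[len(inp):] for inp, out in zip(inputs.input_ids, output_ids)]
print(processor.batch_decode(trimmed, skip_special_tokens=True)[0])
```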
Performance
Timeline
Released: January 26, 2025
Knowledge Cutoff: Unknown
Specifications
Parameters: 8.3B
License: Apache 2.0
Training Data: Unknown
Tags: tuning:instruct
Benchmarks
Benchmark chart: Qwen2.5 VL 7B Instruct performance across datasets. Scores are sourced from the model's scorecard, paper, or official blog posts.
Pricing
Pricing, performance, and capabilities for Qwen2.5 VL 7B Instruct across different providers:
No pricing information available for this model.
API Access
API access for Qwen2.5 VL 7B Instruct will be available soon through our gateway.
FAQ
Common questions about Qwen2.5 VL 7B Instruct
Q: When was Qwen2.5 VL 7B Instruct released?
A: Qwen2.5 VL 7B Instruct was released by Qwen on January 26, 2025.

Q: Who created Qwen2.5 VL 7B Instruct?
A: Qwen2.5 VL 7B Instruct was created by Qwen.

Q: How many parameters does Qwen2.5 VL 7B Instruct have?
A: Qwen2.5 VL 7B Instruct has 8.3 billion parameters.

Q: What license is Qwen2.5 VL 7B Instruct released under?
A: Qwen2.5 VL 7B Instruct is released under the Apache 2.0 license, an open-source/open-weight license.

Q: Is Qwen2.5 VL 7B Instruct a multimodal model?
A: Yes, Qwen2.5 VL 7B Instruct is a multimodal model that can process both text and images as input.
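Since the Overview also mentions visual localization and structured output generation, the same pipeline shown in the Overview sketch can be prompted for bounding boxes returned as JSON. The prompt wording and the 'label'/'bbox_2d' field names below are illustrative assumptions, not an official schema from this page:

```python
# Illustrative grounding-style prompt; reuse the model/processor pipeline from
# the earlier sketch. Image path, prompt wording, and field names are placeholders.
grounding_messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": "path/to/example.jpg"},  # placeholder
            {
                "type": "text",
                "text": "Locate every person in the image and return the result as "
                        "JSON objects with 'label' and 'bbox_2d' in pixel coordinates.",
            },
        ],
    }
]
# Feed grounding_messages through apply_chat_template / process_vision_info /
# generate exactly as in the earlier example, then parse the reply with json.loads.
```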