Qwen2-VL-72B-Instruct
Overview
An instruction-tuned, large multimodal model that excels at visual understanding and step-by-step reasoning. It supports image and video input, with dynamic resolution handling and Multimodal Rotary Position Embedding (M-RoPE), enabling advanced capabilities such as complex problem solving, multilingual text recognition in images, and agent-like interactions in video contexts.
Qwen2-VL-72B-Instruct was released on August 29, 2024.
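Dynamic resolution handling means each image is mapped to a variable number of visual tokens rather than a fixed grid. A minimal sketch of how such token budgeting might work, assuming 28-pixel token cells (14-pixel patches merged 2x2, as reported for Qwen2-VL) and illustrative token bounds that are not confirmed defaults:

```python
import math

# Assumed constants for illustration; actual values depend on the
# model's processor configuration.
TOKEN_CELL = 28    # pixels covered by one visual token per side
MIN_TOKENS = 4     # assumed lower bound
MAX_TOKENS = 1280  # assumed upper bound

def visual_token_count(height: int, width: int) -> int:
    """Estimate visual tokens for an image, clamped to an assumed budget."""
    # Round each side up to whole token cells.
    rows = math.ceil(height / TOKEN_CELL)
    cols = math.ceil(width / TOKEN_CELL)
    tokens = rows * cols
    # If over budget, the image would be downscaled; model that by
    # shrinking both sides with the same factor.
    if tokens > MAX_TOKENS:
        scale = math.sqrt(MAX_TOKENS / tokens)
        rows = max(1, math.floor(rows * scale))
        cols = max(1, math.floor(cols * scale))
        tokens = rows * cols
    return max(tokens, MIN_TOKENS)

print(visual_token_count(280, 280))    # 10x10 cells -> 100 tokens
print(visual_token_count(4000, 3000))  # downscaled to fit the assumed budget
```

The point of this scheme is that small images stay cheap while large images are scaled down rather than cropped, preserving aspect ratio.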
Performance
Timeline
Released: August 29, 2024
Knowledge Cutoff: June 2023
Specifications
Parameters
73.4B
License
tongyi-qianwen
Training Data
Unknown
Tags
tuning:instruct
Benchmarks
Qwen2-VL-72B-Instruct Performance Across Datasets
Scores sourced from the model's scorecard, paper, or official blog posts
Pricing
Pricing, performance, and capabilities for Qwen2-VL-72B-Instruct across different providers:
No pricing information available for this model.
API Access
API Access Coming Soon
API access for Qwen2-VL-72B-Instruct will be available soon through our gateway.
FAQ
Common questions about Qwen2-VL-72B-Instruct
When was Qwen2-VL-72B-Instruct released?
Qwen2-VL-72B-Instruct was released on August 29, 2024 by Qwen.
Who created Qwen2-VL-72B-Instruct?
Qwen2-VL-72B-Instruct was created by Qwen.
How many parameters does Qwen2-VL-72B-Instruct have?
Qwen2-VL-72B-Instruct has 73.4 billion parameters.
What license is Qwen2-VL-72B-Instruct released under?
Qwen2-VL-72B-Instruct is released under the tongyi-qianwen license.
What is the knowledge cutoff of Qwen2-VL-72B-Instruct?
Qwen2-VL-72B-Instruct has a knowledge cutoff of June 2023. This means the model was trained on data up to this date and may not have information about events after this time.
Is Qwen2-VL-72B-Instruct multimodal?
Yes, Qwen2-VL-72B-Instruct is a multimodal model that can process both text and images as input.
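A multimodal request to a model like this typically interleaves text and image parts in a single user turn. A minimal sketch of such a payload in the OpenAI-style chat format many gateways use for vision models; the field names and URL are illustrative assumptions, not confirmed for any specific provider:

```python
import json

def build_multimodal_message(prompt: str, image_url: str) -> dict:
    """Pack a text prompt and an image reference into one user turn.
    Content-part field names follow the common OpenAI-style convention
    and are an assumption here."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }

# Hypothetical request body; the model identifier and URL are placeholders.
payload = {
    "model": "Qwen2-VL-72B-Instruct",
    "messages": [build_multimodal_message(
        "Describe this chart.", "https://example.com/chart.png")],
}
print(json.dumps(payload, indent=2))
```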