Qwen logo

Qwen2-VL-72B-Instruct

Qwen
qwen2-vl-72bVariant

Overview

An instruction-tuned, large multimodal model that excels at visual understanding and step-by-step reasoning. It supports image and video input, with dynamic resolution handling and improved positional embeddings (M-ROPE), enabling advanced capabilities such as complex problem solving, multilingual text recognition in images, and agent-like interactions in video contexts.

Qwen2-VL-72B-Instruct was released on August 29, 2024.

Performance

Timeline

Release DateUnknown
Knowledge CutoffUnknown

Other Details

Parameters
73.4B
License
tongyi-qianwen
Training Data
Unknown
Tags
tuning:instruct

Related Models

Compare Qwen2-VL-72B-Instruct to other models by quality (GPQA score) vs cost. Higher scores and lower costs represent better value.

Performance visualization loading...

Gathering benchmark data from similar models

Benchmarks

Qwen2-VL-72B-Instruct Performance Across Datasets

Scores sourced from the model's scorecard, paper, or official blog posts

LLM Stats Logollm-stats.com - Mon Dec 08 2025
Notice missing or incorrect data?Start an Issue discussion

Pricing

Pricing, performance, and capabilities for Qwen2-VL-72B-Instruct across different providers:

No pricing information available for this model.

Example Outputs

Recent Posts

Recent Reviews

API Access

API Access Coming Soon

API access for Qwen2-VL-72B-Instruct will be available soon through our gateway.

FAQ

Common questions about Qwen2-VL-72B-Instruct

Qwen2-VL-72B-Instruct was released on August 29, 2024.
Qwen2-VL-72B-Instruct has 73.4 billion parameters.