Model Comparison

Qwen3 VL 30B A3B Thinking vs QwQ-32B-Preview

Qwen3 VL 30B A3B Thinking leads on GPQA, the only benchmark reported for both models (74.4% vs 65.2%). QwQ-32B-Preview is about 2.4x cheaper per token at a 3:1 input-to-output mix.

Performance Benchmarks

Comparative analysis across standard metrics

1 shared benchmark

Only one benchmark, GPQA, is reported for both models. Qwen3 VL 30B A3B Thinking scores 74.4% and QwQ-32B-Preview scores 65.2%, a lead of 9.2 percentage points. QwQ-32B-Preview does not lead on any shared benchmark.

Sun May 10 2026 • llm-stats.com

Pricing Analysis

Price comparison per million tokens

QwQ-32B-Preview costs less

For input processing, Qwen3 VL 30B A3B Thinking ($0.20/1M tokens) is 1.3x more expensive than QwQ-32B-Preview ($0.15/1M tokens).

For output processing, Qwen3 VL 30B A3B Thinking ($0.99/1M tokens) is about 5x more expensive than QwQ-32B-Preview ($0.20/1M tokens).

Overall, Qwen3 VL 30B A3B Thinking is about 2.4x more expensive than QwQ-32B-Preview on a blended basis ($0.40 vs $0.16 per 1M tokens).*

* Using a 3:1 ratio of input to output tokens
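
The blended figure behind the footnote can be reproduced with a short calculation. This is a minimal sketch, assuming "3:1" means three input tokens for every output token and that the blend is a simple weighted average; the function name `blended_price` is hypothetical and not taken from the source.

```python
def blended_price(input_price: float, output_price: float,
                  input_parts: int = 3, output_parts: int = 1) -> float:
    """Weighted average price per 1M tokens for a given input:output mix."""
    total_parts = input_parts + output_parts
    return (input_parts * input_price + output_parts * output_price) / total_parts


# Lowest listed prices in USD per 1M tokens, as quoted in this section.
qwen3_vl = blended_price(0.20, 0.99)
qwq = blended_price(0.15, 0.20)

print(f"Qwen3 VL 30B A3B Thinking: ${qwen3_vl:.4f} per 1M blended tokens")
print(f"QwQ-32B-Preview:           ${qwq:.4f} per 1M blended tokens")
print(f"Ratio: {qwen3_vl / qwq:.1f}x")
```

A different workload mix changes the ratio: passing `input_parts=1, output_parts=1` weights the larger output-price gap more heavily.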

Lowest available price from all providers
Qwen3 VL 30B A3B Thinking (Alibaba Cloud / Qwen Team)
Input tokens: $0.20
Output tokens: $0.99
Best provider: Novita

QwQ-32B-Preview (Alibaba Cloud / Qwen Team)
Input tokens: $0.15
Output tokens: $0.20
Best provider: DeepInfra

Model Size

Parameter count comparison

QwQ-32B-Preview has 1.5B more parameters than Qwen3 VL 30B A3B Thinking, making it 4.8% larger.

Qwen3 VL 30B A3B Thinking (Alibaba Cloud / Qwen Team): 31.0B parameters
QwQ-32B-Preview (Alibaba Cloud / Qwen Team): 32.5B parameters
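
The 4.8% figure is measured relative to the smaller model. A minimal sketch of the arithmetic, using the parameter counts listed above (the variable names are illustrative):

```python
# Parameter counts in billions, as listed in this section.
qwen3_vl_params_b = 31.0
qwq_params_b = 32.5

diff_b = qwq_params_b - qwen3_vl_params_b
# The baseline is the smaller model, so the result reads "X% larger".
relative_pct = diff_b / qwen3_vl_params_b * 100

print(f"{diff_b:.1f}B difference, {relative_pct:.1f}% larger")
```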

Context Window

Maximum input and output token capacity

Qwen3 VL 30B A3B Thinking accepts 131,072 input tokens compared to QwQ-32B-Preview's 32,768 tokens. Both models can generate responses up to 32,768 tokens.

Qwen3 VL 30B A3B Thinking (Alibaba Cloud / Qwen Team)
Input: 131,072 tokens
Output: 32,768 tokens

QwQ-32B-Preview (Alibaba Cloud / Qwen Team)
Input: 32,768 tokens
Output: 32,768 tokens
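
To see what these limits mean for a concrete request, the sketch below checks which model can take a given prompt and response size. It assumes the input and output limits are enforced independently, as the listing presents them; a provider may instead enforce a combined budget. The names `Limits` and `models_that_fit` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    max_input: int   # maximum prompt tokens
    max_output: int  # maximum generated tokens


# Limits as listed in this section.
LIMITS = {
    "Qwen3 VL 30B A3B Thinking": Limits(131_072, 32_768),
    "QwQ-32B-Preview": Limits(32_768, 32_768),
}


def models_that_fit(prompt_tokens: int, response_tokens: int) -> list[str]:
    """Return the models whose listed limits cover the request."""
    return [
        name
        for name, lim in LIMITS.items()
        if prompt_tokens <= lim.max_input and response_tokens <= lim.max_output
    ]


# A 50,000-token prompt exceeds QwQ-32B-Preview's 32,768-token input limit.
print(models_that_fit(50_000, 4_000))
```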

Input Capabilities

Supported data types and modalities

Qwen3 VL 30B A3B Thinking supports multimodal inputs, whereas QwQ-32B-Preview does not.

Qwen3 VL 30B A3B Thinking can handle both text and other forms of data like images, making it suitable for multimodal applications.

Qwen3 VL 30B A3B Thinking: multimodal (text plus other inputs such as images)
QwQ-32B-Preview: text only

License

Usage and distribution terms

Both models are licensed under Apache 2.0, so the same usage and distribution terms apply to each.

Qwen3 VL 30B A3B Thinking: Apache 2.0, open weights
QwQ-32B-Preview: Apache 2.0, open weights

Release Timeline

When each model was launched

Qwen3 VL 30B A3B Thinking was released on 2025-09-22, while QwQ-32B-Preview was released on 2024-11-28.

Qwen3 VL 30B A3B Thinking is about 10 months newer than QwQ-32B-Preview (298 days, or 9 full calendar months).

Qwen3 VL 30B A3B Thinking: Sep 22, 2025 (7 months ago; 9 full months after QwQ-32B-Preview)
QwQ-32B-Preview: Nov 28, 2024 (1.4 years ago)
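
The gap between the two release dates can be stated in days or in whole calendar months, and the two give slightly different impressions. A minimal sketch using the dates listed above:

```python
from datetime import date

qwen3_vl_release = date(2025, 9, 22)
qwq_release = date(2024, 11, 28)

# Exact gap in days.
gap_days = (qwen3_vl_release - qwq_release).days

# Completed calendar months: count month boundaries, then subtract one
# if the later date has not yet reached the earlier date's day of month.
whole_months = ((qwen3_vl_release.year - qwq_release.year) * 12
                + (qwen3_vl_release.month - qwq_release.month))
if qwen3_vl_release.day < qwq_release.day:
    whole_months -= 1

print(gap_days, whole_months)  # 298 9
```

Rounding 298 days to the nearest month gives about 10, while counting only completed months gives 9.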

Knowledge Cutoff

When training data ends

QwQ-32B-Preview has a listed knowledge cutoff of 2024-11-28, the same date as its release, while no cutoff is specified for Qwen3 VL 30B A3B Thinking. Without that second date, the two models' training-data recency cannot be compared directly.

Qwen3 VL 30B A3B Thinking: not specified
QwQ-32B-Preview: Nov 2024

Provider Availability

Qwen3 VL 30B A3B Thinking is available from Novita and DeepInfra. QwQ-32B-Preview is available from DeepInfra, Hyperbolic, Fireworks, and Together.

Qwen3 VL 30B A3B Thinking

Novita: input $0.20/1M, output $1.00/1M
DeepInfra: input $0.29/1M, output $0.99/1M

QwQ-32B-Preview

DeepInfra: input $0.15/1M, output $0.60/1M
Hyperbolic: input $0.20/1M, output $0.20/1M
Fireworks: input $0.89/1M, output $0.89/1M
Together: input $1.20/1M, output $1.20/1M

* Prices shown are per million tokens
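
Because the lowest input price and the lowest output price do not always come from the same provider, it can help to rank providers by a single blended cost. The sketch below does this with the prices listed above. The 3:1 input-to-output weighting is carried over from the pricing footnote and is an assumption here, as are the names `PROVIDERS` and `cheapest_provider`.

```python
# (input, output) prices in USD per 1M tokens, as listed in this section.
PROVIDERS = {
    "Qwen3 VL 30B A3B Thinking": {
        "Novita": (0.20, 1.00),
        "DeepInfra": (0.29, 0.99),
    },
    "QwQ-32B-Preview": {
        "DeepInfra": (0.15, 0.60),
        "Hyperbolic": (0.20, 0.20),
        "Fireworks": (0.89, 0.89),
        "Together": (1.20, 1.20),
    },
}


def cheapest_provider(model: str, input_parts: int = 3,
                      output_parts: int = 1) -> tuple[str, float]:
    """Return (provider, blended price) with the lowest weighted cost."""
    def blended(prices: tuple[float, float]) -> float:
        inp, out = prices
        return (input_parts * inp + output_parts * out) / (input_parts + output_parts)

    name, prices = min(PROVIDERS[model].items(), key=lambda item: blended(item[1]))
    return name, blended(prices)


for model in PROVIDERS:
    provider, price = cheapest_provider(model)
    print(f"{model}: {provider} at ${price:.4f} per 1M blended tokens")
```

The ranking depends on the weighting. With input-only weighting (`input_parts=1, output_parts=0`), DeepInfra comes out cheapest for QwQ-32B-Preview; at 3:1, Hyperbolic does.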

Key Takeaways

Qwen3 VL 30B A3B Thinking (Alibaba Cloud / Qwen Team)

Larger context window (131,072 tokens)
Supports multimodal inputs
Higher GPQA score (74.4% vs 65.2%)

QwQ-32B-Preview (Alibaba Cloud / Qwen Team)

Less expensive input tokens
Less expensive output tokens

FAQ

Common questions about Qwen3 VL 30B A3B Thinking vs QwQ-32B-Preview.

Which is better, Qwen3 VL 30B A3B Thinking or QwQ-32B-Preview?

Qwen3 VL 30B A3B Thinking scores higher on GPQA, the only benchmark reported for both models. Both models are made by Alibaba Cloud / Qwen Team. The best choice depends on your use case: compare their benchmark scores, pricing, and capabilities above.

How does Qwen3 VL 30B A3B Thinking compare to QwQ-32B-Preview in benchmarks?

Qwen3 VL 30B A3B Thinking scores DocVQAtest: 95.0%, ScreenSpot: 94.7%, MMLU-Redux: 90.9%, MMBench-V1.1: 88.9%, MMLU: 87.6%, and GPQA: 74.4%. QwQ-32B-Preview scores MATH-500: 90.6%, GPQA: 65.2%, AIME 2024: 50.0%, and LiveCodeBench: 50.0%. GPQA is the only benchmark reported for both.
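
Since the two models report mostly different benchmarks, a like-for-like comparison has to start from the benchmarks they share. A minimal sketch, using only the scores quoted on this page (the dictionary names are illustrative):

```python
# Scores in percent, copied from the figures quoted in this comparison.
qwen3_vl_scores = {
    "DocVQAtest": 95.0,
    "ScreenSpot": 94.7,
    "MMLU-Redux": 90.9,
    "MMBench-V1.1": 88.9,
    "MMLU": 87.6,
    "GPQA": 74.4,
}
qwq_scores = {
    "MATH-500": 90.6,
    "GPQA": 65.2,
    "AIME 2024": 50.0,
    "LiveCodeBench": 50.0,
}

# Only benchmarks present in both dictionaries are directly comparable.
shared = sorted(qwen3_vl_scores.keys() & qwq_scores.keys())
for name in shared:
    delta = qwen3_vl_scores[name] - qwq_scores[name]
    print(f"{name}: {qwen3_vl_scores[name]:.1f}% vs "
          f"{qwq_scores[name]:.1f}% ({delta:+.1f} points)")
```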

Is Qwen3 VL 30B A3B Thinking cheaper than QwQ-32B-Preview?

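No. QwQ-32B-Preview is about 1.3x cheaper for input tokens and about 5x cheaper for output tokens. At the lowest listed price across providers, Qwen3 VL 30B A3B Thinking costs $0.20/M input (Novita) and $0.99/M output (DeepInfra), and QwQ-32B-Preview costs $0.15/M input (DeepInfra) and $0.20/M output (Hyperbolic).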

What are the context window sizes for Qwen3 VL 30B A3B Thinking and QwQ-32B-Preview?

Qwen3 VL 30B A3B Thinking supports 131,072 input tokens (128K) and QwQ-32B-Preview supports 32,768 (32K). A larger context window lets you process longer documents, conversations, or codebases in a single request.

What are the main differences between Qwen3 VL 30B A3B Thinking and QwQ-32B-Preview?

Key differences include context window (131,072 vs 32,768 input tokens), input pricing ($0.20 vs $0.15/M), and multimodal support (yes vs no). See the full comparison above for benchmark-by-benchmark results.