Model Comparison

Qwen3 VL 30B A3B Thinking vs QwQ-32B-Preview

Qwen3 VL 30B A3B Thinking leads on the one benchmark reported for both models (GPQA, 74.4% vs 65.2%). QwQ-32B-Preview is about 2.4x cheaper per token at a 3:1 blend of input to output tokens.

Performance Benchmarks

Comparative analysis across standard metrics

1 benchmark

Qwen3 VL 30B A3B Thinking outperforms on 1 benchmark (GPQA, 74.4% vs 65.2%), while QwQ-32B-Preview leads on none.

Sat May 30 2026 • llm-stats.com

Pricing Analysis

Price comparison per million tokens

QwQ-32B-Preview costs less

For input processing, Qwen3 VL 30B A3B Thinking ($0.20/1M tokens) is 1.3x more expensive than QwQ-32B-Preview ($0.15/1M tokens).

For output processing, Qwen3 VL 30B A3B Thinking ($0.99/1M tokens) is roughly 5x (4.95x) more expensive than QwQ-32B-Preview ($0.20/1M tokens).

Overall, at the blended rate, Qwen3 VL 30B A3B Thinking costs about 2.4x as much as QwQ-32B-Preview.*

* Using a 3:1 ratio of input to output tokens

Lowest available price from all providers
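
The 3:1 blend behind the overall comparison can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function name `blended_price` is hypothetical, and it assumes the blend is a simple weighted average of the lowest listed input and output prices. That assumption reproduces the 2.4x figure, but the site does not state its exact formula.

```python
def blended_price(input_price: float, output_price: float,
                  input_parts: int = 3, output_parts: int = 1) -> float:
    """Weighted average price per 1M tokens for a given input:output mix."""
    total_parts = input_parts + output_parts
    return (input_price * input_parts + output_price * output_parts) / total_parts


# Lowest listed prices per 1M tokens, taken from the figures above.
qwen3_vl = blended_price(0.20, 0.99)
qwq = blended_price(0.15, 0.20)
ratio = qwen3_vl / qwq

print(f"Qwen3 VL 30B A3B Thinking: ${qwen3_vl:.4f}/1M")
print(f"QwQ-32B-Preview: ${qwq:.4f}/1M")
print(f"Ratio: {ratio:.1f}x")
```

Changing `input_parts` and `output_parts` shows how the gap moves with the workload: output-heavy workloads widen it, because the output price difference is much larger than the input price difference.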

Qwen3 VL 30B A3B Thinking

Input tokens: $0.20

Output tokens: $0.99

Best provider: Novita (lowest input price; the $0.99 output price is DeepInfra's)

QwQ-32B-Preview

Input tokens: $0.15

Output tokens: $0.20

Best provider: DeepInfra (lowest input price; the $0.20 output price is Hyperbolic's)

Model Size

Parameter count comparison

1.5B diff

QwQ-32B-Preview has 1.5B more parameters than Qwen3 VL 30B A3B Thinking, making it 4.8% larger.

Qwen3 VL 30B A3B Thinking

31.0B parameters

QwQ-32B-Preview

32.5B parameters
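
The 4.8% figure uses the smaller model as the baseline. A minimal sketch of that calculation, using only the parameter counts listed here (the variable names are illustrative):

```python
qwen3_vl_params = 31.0e9  # parameters listed for Qwen3 VL 30B A3B Thinking
qwq_params = 32.5e9       # parameters listed for QwQ-32B-Preview

diff = qwq_params - qwen3_vl_params
# Relative size is measured against the smaller model.
percent_larger = diff / qwen3_vl_params * 100

print(f"Difference: {diff / 1e9:.1f}B parameters")
print(f"QwQ-32B-Preview is {percent_larger:.1f}% larger")
```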

Context Window

Maximum input and output token capacity

Qwen3 VL 30B A3B Thinking accepts 131,072 input tokens compared to QwQ-32B-Preview's 32,768 tokens. Both models can generate responses up to 32,768 tokens.

Qwen3 VL 30B A3B Thinking

Input: 131,072 tokens

Output: 32,768 tokens

QwQ-32B-Preview

Input: 32,768 tokens

Output: 32,768 tokens
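
In practice, these limits decide which model can take a given request. The sketch below checks a request against the limits listed here. The names `ContextLimits`, `LIMITS`, and `models_that_fit` are hypothetical, and the check assumes input and output limits apply independently; a provider may instead enforce a combined budget, so treat this as a rough pre-check rather than a guarantee.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContextLimits:
    max_input_tokens: int
    max_output_tokens: int


# Limits as listed in this comparison.
LIMITS = {
    "Qwen3 VL 30B A3B Thinking": ContextLimits(131_072, 32_768),
    "QwQ-32B-Preview": ContextLimits(32_768, 32_768),
}


def models_that_fit(prompt_tokens: int, response_tokens: int) -> list[str]:
    """Return the models whose listed limits cover the request."""
    return [
        name
        for name, limits in LIMITS.items()
        if prompt_tokens <= limits.max_input_tokens
        and response_tokens <= limits.max_output_tokens
    ]


# A 50,000-token prompt exceeds QwQ-32B-Preview's listed input limit.
print(models_that_fit(prompt_tokens=50_000, response_tokens=4_000))
# A 20,000-token prompt fits both models.
print(models_that_fit(prompt_tokens=20_000, response_tokens=4_000))
```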

Input Capabilities

Supported data types and modalities

Qwen3 VL 30B A3B Thinking supports multimodal inputs, whereas QwQ-32B-Preview does not.

Qwen3 VL 30B A3B Thinking can handle both text and other forms of data like images, making it suitable for multimodal applications.

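Qwen3 VL 30B A3B Thinking

Text: supported

Images: supported

Audio: support status not shown

Video: support status not shown

QwQ-32B-Preview

Text: supported

Images: not supported

Audio: not supported

Video: not supported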

License

Usage and distribution terms

Both models are licensed under Apache 2.0.

Both models share the same licensing terms, providing consistent usage rights.

Qwen3 VL 30B A3B Thinking

Apache 2.0

Open weights

QwQ-32B-Preview

Apache 2.0

Open weights

Release Timeline

When each model was launched

Qwen3 VL 30B A3B Thinking was released on 2025-09-22, while QwQ-32B-Preview was released on 2024-11-28.

Qwen3 VL 30B A3B Thinking is nearly 10 months newer than QwQ-32B-Preview.

Qwen3 VL 30B A3B Thinking

Sep 22, 2025

8 months ago

Nearly 10mo newer

QwQ-32B-Preview

Nov 28, 2024

1.5 years ago
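
The gap between the two release dates can be checked directly. The sketch below uses Python's standard `datetime` module; converting days to months with an average month length is a convention chosen here, not something the site specifies.

```python
from datetime import date

qwen3_vl_release = date(2025, 9, 22)
qwq_release = date(2024, 11, 28)

gap_days = (qwen3_vl_release - qwq_release).days
# Average Gregorian month length, used only to express the gap in months.
AVG_DAYS_PER_MONTH = 365.2425 / 12
gap_months = gap_days / AVG_DAYS_PER_MONTH

print(f"{gap_days} days, about {gap_months:.1f} months")
```

The result is 298 days, or about 9.8 months: nine full calendar months plus 25 days.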

Knowledge Cutoff

When training data ends

QwQ-32B-Preview has a listed knowledge cutoff of November 2024 (recorded as 2024-11-28, the same day as its release), while Qwen3 VL 30B A3B Thinking's cutoff date is not specified.

Without a cutoff date for Qwen3 VL 30B A3B Thinking, the two models cannot be compared directly on training-data recency.

Qwen3 VL 30B A3B Thinking

—

QwQ-32B-Preview

Nov 2024

Provider Availability

Qwen3 VL 30B A3B Thinking is available from Novita and DeepInfra. QwQ-32B-Preview is available from DeepInfra, Hyperbolic, Fireworks, and Together.

Qwen3 VL 30B A3B Thinking

Novita: input $0.20/1M, output $1.00/1M

DeepInfra: input $0.29/1M, output $0.99/1M

QwQ-32B-Preview

DeepInfra: input $0.15/1M, output $0.60/1M

Hyperbolic: input $0.20/1M, output $0.20/1M

Fireworks: input $0.89/1M, output $0.89/1M

Together: input $1.20/1M, output $1.20/1M

* Prices shown are per million tokens
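
Because a request is normally billed by a single provider, it can help to rank providers by blended price rather than by the lowest input and output prices taken separately. The sketch below does that with the prices listed here. The names `PROVIDER_PRICES` and `cheapest_provider` are hypothetical, and the default 75% input share is an assumption matching a 3:1 input-to-output ratio.

```python
# (input, output) price in USD per 1M tokens, as listed above.
PROVIDER_PRICES = {
    "Qwen3 VL 30B A3B Thinking": {
        "Novita": (0.20, 1.00),
        "DeepInfra": (0.29, 0.99),
    },
    "QwQ-32B-Preview": {
        "DeepInfra": (0.15, 0.60),
        "Hyperbolic": (0.20, 0.20),
        "Fireworks": (0.89, 0.89),
        "Together": (1.20, 1.20),
    },
}


def cheapest_provider(model: str, input_share: float = 0.75) -> tuple[str, float]:
    """Return (provider, blended price per 1M tokens) for a given input share."""
    def blended(prices: tuple[float, float]) -> float:
        input_price, output_price = prices
        return input_price * input_share + output_price * (1 - input_share)

    offers = PROVIDER_PRICES[model]
    best = min(offers, key=lambda name: blended(offers[name]))
    return best, blended(offers[best])


for model in PROVIDER_PRICES:
    provider, price = cheapest_provider(model)
    print(f"{model}: {provider} at ${price:.4f}/1M")
```

With these inputs, the cheapest single providers are Novita at $0.40/1M for Qwen3 VL 30B A3B Thinking and Hyperbolic at $0.20/1M for QwQ-32B-Preview, a 2.0x gap. That is somewhat smaller than the 2.4x obtained by combining the lowest input and output prices across different providers.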

Key Takeaways

Qwen3 VL 30B A3B Thinking

Alibaba Cloud / Qwen Team

Larger context window (131,072 tokens)

Supports multimodal inputs

Higher GPQA score (74.4% vs 65.2%)

QwQ-32B-Preview

Alibaba Cloud / Qwen Team

Less expensive input tokens

Less expensive output tokens

Detailed Comparison

AI Model Comparison Table
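Feature	Qwen3 VL 30B A3B Thinking	QwQ-32B-Preview
GPQA	74.4%	65.2%
Lowest input price (per 1M tokens)	$0.20	$0.15
Lowest output price (per 1M tokens)	$0.99	$0.20
Parameters	31.0B	32.5B
Max input tokens	131,072	32,768
Max output tokens	32,768	32,768
Multimodal input	Yes	No
License	Apache 2.0	Apache 2.0
Release date	2025-09-22	2024-11-28
Knowledge cutoff	Not specified	Nov 2024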

FAQ

Common questions about Qwen3 VL 30B A3B Thinking vs QwQ-32B-Preview.

Which is better, Qwen3 VL 30B A3B Thinking or QwQ-32B-Preview?

Qwen3 VL 30B A3B Thinking leads on GPQA (74.4% vs 65.2%), the only benchmark reported for both models. Both models are made by Alibaba Cloud / Qwen Team. The best choice depends on your use case: compare their benchmark scores, pricing, and capabilities above.

How does Qwen3 VL 30B A3B Thinking compare to QwQ-32B-Preview in benchmarks?

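GPQA is the only benchmark with a score for both models: Qwen3 VL 30B A3B Thinking scores 74.4% and QwQ-32B-Preview scores 65.2%. The other reported scores do not overlap. Qwen3 VL 30B A3B Thinking: DocVQA (test) 95.0%, ScreenSpot 94.7%, MMLU-Redux 90.9%, MMBench-V1.1 88.9%, MMLU 87.6%. QwQ-32B-Preview: MATH-500 90.6%, AIME 2024 50.0%, LiveCodeBench 50.0%.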

Is Qwen3 VL 30B A3B Thinking cheaper than QwQ-32B-Preview?

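No. QwQ-32B-Preview is 1.3x cheaper for input tokens and about 5x cheaper for output tokens. The lowest listed prices for Qwen3 VL 30B A3B Thinking are $0.20/M input (Novita) and $0.99/M output (DeepInfra). The lowest listed prices for QwQ-32B-Preview are $0.15/M input (DeepInfra) and $0.20/M output (Hyperbolic).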

What are the context window sizes for Qwen3 VL 30B A3B Thinking and QwQ-32B-Preview?

Qwen3 VL 30B A3B Thinking supports 131,072 input tokens (128K) and QwQ-32B-Preview supports 32,768 (32K). A larger context window lets you process longer documents, conversations, or codebases in a single request.

What are the main differences between Qwen3 VL 30B A3B Thinking and QwQ-32B-Preview?

Key differences include context window (131,072 vs 32,768 input tokens), input pricing ($0.20 vs $0.15/M), and multimodal support (yes vs no). See the full comparison above for benchmark-by-benchmark results.