QwQ-32B-Preview
Overview
QwQ-32B-Preview is an experimental research model focused on advancing AI reasoning capabilities, with particular strength in mathematics and programming. It features deep introspection and self-questioning, but has known limitations, including language mixing and recursive reasoning patterns.
QwQ-32B-Preview was released on November 28, 2024. API access is available through four providers: DeepInfra, Hyperbolic, Fireworks, and Together.
Benchmarks
[Chart: QwQ-32B-Preview Performance Across Datasets. Scores sourced from the model's scorecard, paper, or official blog posts.]
Pricing
Pricing, performance, and capabilities for QwQ-32B-Preview across different providers:
| Provider | Input Price ($/1M tokens) | Output Price ($/1M tokens) | Max Input (tokens) | Max Output (tokens) | Latency (s) | Throughput | Quantization | Input Modalities | Output Modalities |
|---|---|---|---|---|---|---|---|---|---|
| DeepInfra | $0.15 | $0.60 | 32.8K | 32.8K | 0.44 | 76.04 c/s | — | Text Image Audio Video | Text Image Audio Video |
| Hyperbolic | $0.20 | $0.20 | 32.8K | 32.8K | 1.05 | 31.9 c/s | — | Text Image Audio Video | Text Image Audio Video |
| Fireworks | $0.89 | $0.89 | 32.8K | 32.8K | 0.53 | 99.15 c/s | — | Text Image Audio Video | Text Image Audio Video |
| Together | $1.20 | $1.20 | 32.8K | 32.8K | 0.74 | 62.14 c/s | — | Text Image Audio Video | Text Image Audio Video |
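
Because input and output tokens are priced separately, the cheapest provider can depend on the shape of the workload. The following is a minimal sketch of that arithmetic using the prices from the table above; the helper names `request_cost` and `cheapest_provider` and the token counts are hypothetical, chosen only for illustration, and the figures ignore any provider-specific billing rules not shown on this page.

```python
# Prices in USD per 1M tokens (input, output), copied from the pricing table.
PRICES = {
    "DeepInfra": (0.15, 0.60),
    "Hyperbolic": (0.20, 0.20),
    "Fireworks": (0.89, 0.89),
    "Together": (1.20, 1.20),
}


def request_cost(provider: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the listed per-million rates."""
    input_price, output_price = PRICES[provider]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def cheapest_provider(input_tokens: int, output_tokens: int) -> str:
    """Return the provider with the lowest cost for the given token counts."""
    return min(PRICES, key=lambda p: request_cost(p, input_tokens, output_tokens))


if __name__ == "__main__":
    # Hypothetical output-heavy request: 2,000 input and 8,000 output tokens.
    for name in PRICES:
        print(f"{name}: ${request_cost(name, 2_000, 8_000):.4f}")
    print("Cheapest:", cheapest_provider(2_000, 8_000))
```

At these list prices, DeepInfra's lower input rate only wins when a request has more than roughly eight input tokens per output token (0.15i + 0.60o < 0.20i + 0.20o gives i > 8o); for output-heavy reasoning workloads, Hyperbolic's flat rate comes out lower.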
[Chart: Price Comparison for QwQ-32B-Preview. Price per 1M input tokens (USD), lower is better.]
[Chart: Throughput Comparison for QwQ-32B-Preview. Tokens per second, higher is better.]
[Chart: Latency Comparison for QwQ-32B-Preview. Time to first token (s), lower is better.]
API Access
API Access Coming Soon
API access for QwQ-32B-Preview will be available soon through our gateway.
