Qwen2.5-Coder 32B Instruct
Overview
Qwen2.5-Coder is a specialized coding model trained on 5.5 trillion tokens of code data, supporting 92 programming languages with a 128K-token context window. It excels at code generation, completion, repair, and tasks spanning multiple programming languages, while maintaining strong performance in mathematics and general capabilities.
Qwen2.5-Coder 32B Instruct was released on September 19, 2024. API access is available through four providers: Lambda, DeepInfra, Hyperbolic, and Fireworks.
Benchmarks
Qwen2.5-Coder 32B Instruct Performance Across Datasets
Scores sourced from the model's scorecard, paper, or official blog posts
Pricing
Pricing, performance, and capabilities for Qwen2.5-Coder 32B Instruct across different providers:
| Provider | Input ($/M tokens) | Output ($/M tokens) | Max Input (tokens) | Max Output (tokens) | Latency (s) | Throughput | Quantization | Input Modalities | Output Modalities |
|---|---|---|---|---|---|---|---|---|---|
| Lambda | $0.09 | $0.09 | 128.0K | 128.0K | 0.5 | 42.0 c/s | — | Text | Text |
| DeepInfra | $0.18 | $0.18 | 128.0K | 128.0K | 0.5 | 44.0 c/s | — | Text | Text |
| Hyperbolic | $0.20 | $0.20 | 128.0K | 128.0K | 0.5 | 100.0 c/s | — | Text | Text |
| Fireworks | $0.89 | $0.89 | 128.0K | 128.0K | 0.26 | 110.0 c/s | — | Text | Text |
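To make the per-million-token prices concrete, the following is a minimal sketch that estimates the cost of a single request at each provider's listed rates. The prices are copied from the table; the function name `request_cost` and the example workload of 8,000 input tokens and 2,000 output tokens are hypothetical, chosen only for illustration. Actual billing may differ (rounding, minimum charges, or price changes).

```python
# Per-million-token prices in USD as (input, output), copied from the provider table.
PRICES = {
    "Lambda": (0.09, 0.09),
    "DeepInfra": (0.18, 0.18),
    "Hyperbolic": (0.20, 0.20),
    "Fireworks": (0.89, 0.89),
}


def request_cost(provider: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request at the listed rates."""
    input_price, output_price = PRICES[provider]
    total = input_tokens * input_price + output_tokens * output_price
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 8,000 prompt tokens and 2,000 completion tokens.
    for name in PRICES:
        print(f"{name}: ${request_cost(name, 8_000, 2_000):.6f}")
```

Because every listed provider charges the same rate for input and output tokens, the cost here depends only on the total token count; the two prices are kept separate so the sketch still works if that changes.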
Price Comparison for Qwen2.5-Coder 32B Instruct
Price per 1M input tokens (USD), lower is better
Throughput Comparison for Qwen2.5-Coder 32B Instruct
Tokens per second, higher is better
Latency Comparison for Qwen2.5-Coder 32B Instruct
Time to first token (s), lower is better
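Latency and throughput can be combined into a rough end-to-end estimate: time to first token plus the time needed to stream the remaining output. The sketch below uses the figures from the provider table and assumes throughput is measured in output tokens per second, as the chart caption states (the table itself labels the unit "c/s"). The function name `estimated_seconds` is hypothetical, and the model ignores queueing, network overhead, and prompt-length effects.

```python
# (time to first token in seconds, throughput) as listed in the provider table.
# Assumption: throughput is treated as output tokens per second.
PROVIDERS = {
    "Lambda": (0.5, 42.0),
    "DeepInfra": (0.5, 44.0),
    "Hyperbolic": (0.5, 100.0),
    "Fireworks": (0.26, 110.0),
}


def estimated_seconds(provider: str, output_tokens: int) -> float:
    """Estimate total response time: first-token latency plus streaming time."""
    latency, throughput = PROVIDERS[provider]
    return latency + output_tokens / throughput


if __name__ == "__main__":
    # Hypothetical response length of 1,000 output tokens.
    for name in PROVIDERS:
        print(f"{name}: {estimated_seconds(name, 1_000):.2f} s")
```

For long completions the throughput term dominates, so a provider with slightly higher first-token latency can still finish sooner if it streams faster.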
API Access
API access for Qwen2.5-Coder 32B Instruct will be available soon through our gateway.
