Model Comparison

DeepSeek-R1-0528 vs Granite 3.3 8B Instruct

DeepSeek-R1-0528 outperforms Granite 3.3 8B Instruct on AIME 2024, the only benchmark reported for both models. Granite 3.3 8B Instruct is about 1.8x cheaper per token at a 3:1 ratio of input to output tokens.

Performance Benchmarks

Comparative analysis across standard metrics

1 shared benchmark

DeepSeek-R1-0528 outperforms on 1 benchmark (AIME 2024: 91.4% vs 81.2%), while Granite 3.3 8B Instruct leads on none.

Mon Apr 20 2026 • llm-stats.com

Pricing Analysis

Price comparison per million tokens

Granite 3.3 8B Instruct costs less

For input processing, DeepSeek-R1-0528 ($0.50/1M tokens) costs the same as Granite 3.3 8B Instruct ($0.50/1M tokens).

For output processing, DeepSeek-R1-0528 ($2.15/1M tokens) is 4.3x more expensive than Granite 3.3 8B Instruct ($0.50/1M tokens).

In conclusion, DeepSeek-R1-0528 is more expensive than Granite 3.3 8B Instruct.*

* Using a 3:1 ratio of input to output tokens
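
The blended comparison in the footnote can be reproduced with a short calculation. The sketch below assumes the blended price is a simple weighted average of the input and output prices at the stated 3:1 ratio; the function name `blended_price` is hypothetical and not part of any provider API.

```python
def blended_price(input_price, output_price, input_ratio=3, output_ratio=1):
    """Weighted average price per 1M tokens for a given input:output token mix."""
    total = input_ratio + output_ratio
    return (input_price * input_ratio + output_price * output_ratio) / total


# Prices per 1M tokens, taken from the comparison above.
deepseek = blended_price(0.50, 2.15)
granite = blended_price(0.50, 0.50)

print(f"DeepSeek-R1-0528: ${deepseek:.4f} per 1M tokens")
print(f"Granite 3.3 8B Instruct: ${granite:.4f} per 1M tokens")
print(f"Ratio: {deepseek / granite:.1f}x")
```

Under this assumption the ratio comes out at roughly 1.8x, consistent with the summary at the top of the page. A workload with a different input-to-output mix would shift the result, so the ratio arguments are worth adjusting to match actual usage.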

Lowest available price from all providers (per 1M tokens)

DeepSeek-R1-0528 (DeepSeek)
Input tokens: $0.50
Output tokens: $2.15
Best provider: DeepInfra

Granite 3.3 8B Instruct (IBM)
Input tokens: $0.50
Output tokens: $0.50
Best provider: Replicate

Model Size

Parameter count comparison

663.0B difference

DeepSeek-R1-0528 has 663.0B more parameters than Granite 3.3 8B Instruct, making it 8287.5% larger.
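
The size figures follow directly from the two parameter counts. A minimal sketch of the arithmetic, with variable names chosen here for illustration:

```python
# Parameter counts in billions, as listed in this comparison.
deepseek_params_b = 671.0
granite_params_b = 8.0

diff_b = deepseek_params_b - granite_params_b      # absolute difference
percent_larger = diff_b / granite_params_b * 100   # relative to the smaller model
size_ratio = deepseek_params_b / granite_params_b  # how many times the size

print(f"Difference: {diff_b:.1f}B")
print(f"Percent larger: {percent_larger:.1f}%")
print(f"Size ratio: {size_ratio:.3f}x")
```

Note that "8287.5% larger" and "about 83.9 times the size" describe the same gap; the percentage is computed on the difference, not on the full parameter count.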

DeepSeek-R1-0528 (DeepSeek): 671.0B parameters
Granite 3.3 8B Instruct (IBM): 8.0B parameters

Context Window

Maximum input and output token capacity

DeepSeek-R1-0528 accepts 131,072 input tokens compared to Granite 3.3 8B Instruct's 128,000 tokens. DeepSeek-R1-0528 can generate longer responses up to 131,072 tokens, while Granite 3.3 8B Instruct is limited to 8,192 tokens.
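
To see what these limits mean in practice, the sketch below checks whether a request fits each model. It assumes the input and output limits apply independently, as they are listed here; some providers instead enforce a shared budget for prompt plus completion, so treat this as illustrative. The `Limits` class and `fits` helper are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    max_input_tokens: int
    max_output_tokens: int


# Limits as listed in this comparison.
LIMITS = {
    "DeepSeek-R1-0528": Limits(131_072, 131_072),
    "Granite 3.3 8B Instruct": Limits(128_000, 8_192),
}


def fits(model: str, prompt_tokens: int, requested_output_tokens: int) -> bool:
    """Return True if both the prompt and the requested output are within limits."""
    limits = LIMITS[model]
    return (
        prompt_tokens <= limits.max_input_tokens
        and requested_output_tokens <= limits.max_output_tokens
    )


for name in LIMITS:
    print(name, fits(name, prompt_tokens=100_000, requested_output_tokens=16_000))
```

With a 100,000-token prompt and a 16,000-token response, only DeepSeek-R1-0528 passes the check: both models accept the prompt, but the requested output exceeds Granite 3.3 8B Instruct's 8,192-token output limit.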

DeepSeek-R1-0528 (DeepSeek)
Input: 131,072 tokens
Output: 131,072 tokens

Granite 3.3 8B Instruct (IBM)
Input: 128,000 tokens
Output: 8,192 tokens

Input Capabilities

Supported data types and modalities

Granite 3.3 8B Instruct supports multimodal inputs, whereas DeepSeek-R1-0528 does not.

Granite 3.3 8B Instruct can handle both text and other forms of data like images, making it suitable for multimodal applications.

DeepSeek-R1-0528: Text

Granite 3.3 8B Instruct: Text, Images (audio and video support not indicated)

License

Usage and distribution terms

DeepSeek-R1-0528 is licensed under MIT, while Granite 3.3 8B Instruct uses Apache 2.0.

License differences may affect how you can use these models in commercial or open-source projects.

DeepSeek-R1-0528: MIT, open weights

Granite 3.3 8B Instruct: Apache 2.0, open weights

Release Timeline

When each model was launched

DeepSeek-R1-0528 was released on 2025-05-28, while Granite 3.3 8B Instruct was released on 2025-04-16.

DeepSeek-R1-0528 is 1 month newer than Granite 3.3 8B Instruct.
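
The "1 month" figure is rounded. The exact gap between the two release dates can be computed with Python's standard `datetime` module:

```python
from datetime import date

# Release dates as stated above.
deepseek_release = date(2025, 5, 28)
granite_release = date(2025, 4, 16)

gap_days = (deepseek_release - granite_release).days
print(f"DeepSeek-R1-0528 was released {gap_days} days after Granite 3.3 8B Instruct")
```

The gap is 42 days, or six weeks.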

DeepSeek-R1-0528: May 28, 2025 (10 months ago, 1 month newer)

Granite 3.3 8B Instruct: Apr 16, 2025 (1 year ago)

Knowledge Cutoff

When training data ends

Granite 3.3 8B Instruct has a documented knowledge cutoff of 2024-04-01, while DeepSeek-R1-0528's cutoff date is not specified.

We can confirm Granite 3.3 8B Instruct's training data extends to 2024-04-01, but cannot make a direct comparison without DeepSeek-R1-0528's cutoff date.

DeepSeek-R1-0528: not specified

Granite 3.3 8B Instruct: Apr 2024

Provider Availability

DeepSeek-R1-0528 is available from DeepInfra, DeepSeek, and Novita. Granite 3.3 8B Instruct is available only from Replicate.

DeepSeek-R1-0528

DeepInfra: $0.50/1M input, $2.15/1M output
DeepSeek: $0.55/1M input, $2.19/1M output
Novita: $0.70/1M input, $2.50/1M output

Granite 3.3 8B Instruct

Replicate: $0.50/1M input, $0.50/1M output
* Prices shown are per million tokens
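
One way to pick a provider from the listed prices is to rank them by blended cost. The sketch below assumes the same 3:1 input-to-output weighting used in the pricing analysis; the `PROVIDERS` table and the helper names are hypothetical and simply restate the prices on this page.

```python
# (input price, output price) per 1M tokens, as listed on this page.
PROVIDERS = {
    "DeepSeek-R1-0528": {
        "DeepInfra": (0.50, 2.15),
        "DeepSeek": (0.55, 2.19),
        "Novita": (0.70, 2.50),
    },
    "Granite 3.3 8B Instruct": {
        "Replicate": (0.50, 0.50),
    },
}


def blended(prices, input_ratio=3, output_ratio=1):
    """Weighted average of input and output price for a given token mix."""
    input_price, output_price = prices
    total = input_ratio + output_ratio
    return (input_price * input_ratio + output_price * output_ratio) / total


def cheapest_provider(model):
    """Return the provider with the lowest blended price for the model."""
    offers = PROVIDERS[model]
    return min(offers, key=lambda provider: blended(offers[provider]))


for model in PROVIDERS:
    print(model, "->", cheapest_provider(model))
```

For DeepSeek-R1-0528 this selects DeepInfra, matching the best provider listed in the pricing analysis. Price is only one criterion; latency, rate limits, and regional availability are not captured here.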

Key Takeaways

DeepSeek-R1-0528
Larger context window (131,072 tokens)
Higher AIME 2024 score (91.4% vs 81.2%)

Granite 3.3 8B Instruct
Supports multimodal inputs
Less expensive output tokens

FAQ

Common questions about DeepSeek-R1-0528 vs Granite 3.3 8B Instruct

DeepSeek-R1-0528 outperforms Granite 3.3 8B Instruct on AIME 2024, the only benchmark reported for both models. The best choice depends on your use case: compare the benchmark scores, pricing, and capabilities above.
DeepSeek-R1-0528 scores 93.4% on MMLU-Redux, 92.3% on SimpleQA, 91.4% on AIME 2024, 87.5% on AIME 2025, and 85.0% on MMLU-Pro. Granite 3.3 8B Instruct scores 89.7% on HumanEval, 88.5% on AttaQ, 86.1% on HumanEval+, 81.2% on AIME 2024, and 80.9% on GSM8k.
Both models cost $0.50 per million input tokens. Output tokens cost $2.15 per million for DeepSeek-R1-0528 and $0.50 per million for Granite 3.3 8B Instruct.
DeepSeek-R1-0528 supports 131K input tokens and Granite 3.3 8B Instruct supports 128K. A larger context window lets you process longer documents, conversations, or codebases in a single request.
Key differences include context window (131K vs 128K), multimodal support (no vs yes), and licensing (MIT vs Apache 2.0). See the full comparison above for benchmark results.
DeepSeek-R1-0528 is developed by DeepSeek and Granite 3.3 8B Instruct is developed by IBM.