Llama 3.3 70B Instruct: Benchmarks, Pricing & Context Window
Llama 3.3 70B Instruct is a language model from Meta, released in December 2024.
Llama 3.3 is a multilingual large language model optimized for dialogue use cases across multiple languages. It is a pretrained and instruction-tuned generative model with 70 billion parameters, outperforming many open-source and closed
Llama 3.3 70B Instruct pricing
Providers
Llama 3.3 70B Instruct starts at $0.200 per million input tokens and $0.200 per million output tokens via Lambda. See all 9 providers below with their per-token pricing, latency, throughput, and modality support.
| Provider | Input $/M | Output $/M | Max Input | Max Output | Latency (s) | Throughput | Quant | Input modality | Output modality |
|---|---|---|---|---|---|---|---|---|---|
| Lambda | $0.200 | $0.200 | 128.0K | 128.0K | 0.65 | 42 c/s | — | | |
| | $0.230 | $0.400 | 128.0K | 128.0K | 0.65 | 37 c/s | — | | |
| | $0.400 | $0.400 | 128.0K | 128.0K | 0.65 | 42 c/s | — | | |
| | $0.590 | $7.90 | 128.0K | 128.0K | 0.65 | 268 c/s | — | | |
| | $0.600 | $1.20 | 128.0K | 128.0K | 0.65 | 1096 c/s | — | | |
| | $0.700 | $0.800 | 128.0K | 128.0K | 0.65 | 2220 c/s | — | | |
| | $0.720 | $0.720 | 128.0K | 128.0K | 0.50 | 100 c/s | — | | |
| | $0.880 | $0.880 | 128.0K | 128.0K | 0.65 | 65 c/s | — | | |
| | $0.890 | $0.890 | 128.0K | 128.0K | 0.65 | 197 c/s | — | | |
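As a rough illustration of how per-million-token prices translate into a per-request cost, the sketch below multiplies token counts by the listed input and output rates. The function name `request_cost` and the token counts are hypothetical and chosen only for illustration. The prices are the $0.200 / $0.200 figures quoted above for Lambda.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)


def request_cost(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Estimate the USD cost of one request from per-million-token prices.

    Prices are passed as strings (e.g. "0.200") so that Decimal keeps them exact.
    This is an illustrative helper, not part of any provider's API.
    """
    input_cost = Decimal(input_tokens) * Decimal(str(input_price_per_m))
    output_cost = Decimal(output_tokens) * Decimal(str(output_price_per_m))
    return (input_cost + output_cost) / TOKENS_PER_MILLION


# Hypothetical request: 10,000 input tokens and 2,000 output tokens
# at the lowest listed rate ($0.200 in, $0.200 out per million tokens).
cost = request_cost(10_000, 2_000, "0.200", "0.200")
print(f"Estimated cost: ${cost:.4f}")
```

Because input and output tokens are priced separately for several providers, the cheapest option for a given workload depends on its ratio of input to output tokens, not on the input price alone.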
Llama 3.3 70B Instruct API
API access coming soon
Llama 3.3 70B Instruct will be available through our gateway shortly.
Llama 3.3 70B Instruct examples
Recent arena outputs from Llama 3.3 70B Instruct, picked from the highest-ranked matchups.
Llama 3.3 70B Instruct license
Llama 3.3 70B Instruct is released under the Llama 3.3 Community License Agreement, which restricts commercial use. The model has 70.0B parameters.
- License: Llama 3.3 Community License Agreement (Non-commercial)
- Parameters: 70.0B
FAQ
Common questions about Llama 3.3 70B Instruct.