Llama 3.2 90B Instruct
Overview
Llama 3.2 90B is a large multimodal language model optimized for visual recognition, image reasoning, and captioning tasks. It supports a context length of 128,000 tokens and targets image understanding and text generation tasks.
Llama 3.2 90B Instruct was released on September 25, 2024. API access is available through 5 providers: DeepInfra, Bedrock, Fireworks, Together, and Hyperbolic.
Benchmarks
Llama 3.2 90B Instruct Performance Across Datasets
Scores sourced from the model's scorecard, paper, or official blog posts
Pricing
Pricing, performance, and capabilities for Llama 3.2 90B Instruct across different providers:
| Provider | Input price ($/M tokens) | Output price ($/M tokens) | Max Input | Max Output | Latency (s) | Throughput | Quantization | Input modalities | Output modalities |
|---|---|---|---|---|---|---|---|---|---|
| DeepInfra | $0.35 | $0.40 | 128.0K | 128.0K | 0.5 | 24.0 c/s | — | Text Image Audio Video | Text Image Audio Video |
| Bedrock | $0.72 | $0.72 | 128.0K | 128.0K | 0.5 | 100.0 c/s | — | Text Image Audio Video | Text Image Audio Video |
| Fireworks | $0.89 | $0.89 | 128.0K | 128.0K | 0.5 | 50.0 c/s | — | Text Image Audio Video | Text Image Audio Video |
| Together | $1.20 | $1.20 | 128.0K | 128.0K | 0.5 | 57.0 c/s | — | Text Image Audio Video | Text Image Audio Video |
| Hyperbolic | $2.00 | $2.00 | 128.0K | 128.0K | 0.5 | 42.0 c/s | — | Text Image Audio Video | Text Image Audio Video |
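The per-million-token prices in the table can be turned into a rough per-request cost. The following is a minimal sketch, assuming billing is linear in token counts; the names `PRICES_PER_M`, `estimate_cost`, and `cheapest` are hypothetical helpers for illustration, and how each provider counts image inputs toward tokens is not covered here.

```python
# Prices copied from the table above, in USD per 1M tokens: (input, output).
# Assumption: cost scales linearly with token counts, with no minimums or tiers.
PRICES_PER_M = {
    "DeepInfra": (0.35, 0.40),
    "Bedrock": (0.72, 0.72),
    "Fireworks": (0.89, 0.89),
    "Together": (1.20, 1.20),
    "Hyperbolic": (2.00, 2.00),
}


def estimate_cost(provider: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD of one request for the given provider."""
    in_price, out_price = PRICES_PER_M[provider]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


def cheapest(input_tokens: int, output_tokens: int) -> str:
    """Return the provider with the lowest estimated cost for this token mix."""
    return min(
        PRICES_PER_M,
        key=lambda name: estimate_cost(name, input_tokens, output_tokens),
    )


if __name__ == "__main__":
    # Example workload: 10,000 input tokens and 1,000 output tokens.
    for name in PRICES_PER_M:
        print(f"{name}: ${estimate_cost(name, 10_000, 1_000):.6f}")
    print("Cheapest:", cheapest(10_000, 1_000))
```

Because most providers listed charge the same rate for input and output, the ranking rarely depends on the input/output mix; only DeepInfra prices the two differently.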
Price Comparison for Llama 3.2 90B Instruct
Price per 1M input tokens (USD), lower is better
Throughput Comparison for Llama 3.2 90B Instruct
Tokens per second, higher is better
Latency Comparison for Llama 3.2 90B Instruct
Time to first token (s), lower is better
API Access
API access for Llama 3.2 90B Instruct will be available soon through our gateway.