Llama 3.1 8B Instruct
Overview
Llama 3.1 8B Instruct is a multilingual large language model optimized for dialogue use cases. It offers a 128K-token context window, tool-use support, and strong reasoning capabilities for its size.
Llama 3.1 8B Instruct was released on July 23, 2024. API access is available through 9 providers, including Lambda and DeepInfra.
Benchmarks
[Figure: Llama 3.1 8B Instruct performance across datasets. Scores are sourced from the model's scorecard, paper, or official blog posts.]
Pricing
Pricing, performance, and capabilities for Llama 3.1 8B Instruct across different providers:
| Provider | Input ($/1M tokens) | Output ($/1M tokens) | Max Input (tokens) | Max Output (tokens) | Latency (s) | Throughput | Quantization | Input Modalities | Output Modalities |
|---|---|---|---|---|---|---|---|---|---|
| Lambda | $0.03 | $0.03 | 131.1K | 131.1K | 0.5 | 42.0 c/s | n/a | Text Image Audio Video | Text Image Audio Video |
| DeepInfra | $0.05 | $0.05 | 131.1K | 131.1K | 0.5 | 118.0 c/s | n/a | Text Image Audio Video | Text Image Audio Video |
| Groq | $0.05 | $0.08 | 131.1K | 131.1K | 0.5 | 750.0 c/s | n/a | Text Image Audio Video | Text Image Audio Video |
| Sambanova | $0.10 | $0.20 | 131.1K | 131.1K | 0.5 | 1050.0 c/s | n/a | Text Image Audio Video | Text Image Audio Video |
| Cerebras | $0.10 | $0.10 | 131.1K | 131.1K | 0.2 | 2047.0 c/s | n/a | Text Image Audio Video | Text Image Audio Video |
| Hyperbolic | $0.10 | $0.10 | 131.1K | 131.1K | 0.5 | 200.0 c/s | n/a | Text Image Audio Video | Text Image Audio Video |
| Together | $0.20 | $0.20 | 131.1K | 131.1K | 0.5 | 194.0 c/s | n/a | Text Image Audio Video | Text Image Audio Video |
| Fireworks | $0.20 | $0.20 | 131.1K | 131.1K | 0.5 | 292.0 c/s | n/a | Text Image Audio Video | Text Image Audio Video |
| Bedrock | $0.22 | $0.22 | 131.1K | 131.1K | 0.5 | 100.0 c/s | n/a | Text Image Audio Video | Text Image Audio Video |
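
Because input and output tokens are priced separately, the cheapest provider can depend on the shape of a request. The sketch below shows one way to turn the per-million-token prices listed above into a per-request cost. The `PRICES` dictionary and the function names `request_cost` and `cheapest_provider` are illustrative, not part of any provider's API, and the figures are simply copied from the table, so they may be out of date.

```python
# Per-million-token prices in USD as (input, output), copied from the table above.
# These are illustrative snapshots; real prices may have changed.
PRICES = {
    "Lambda": (0.03, 0.03),
    "DeepInfra": (0.05, 0.05),
    "Groq": (0.05, 0.08),
    "Sambanova": (0.10, 0.20),
    "Cerebras": (0.10, 0.10),
    "Hyperbolic": (0.10, 0.10),
    "Together": (0.20, 0.20),
    "Fireworks": (0.20, 0.20),
    "Bedrock": (0.22, 0.22),
}


def request_cost(provider: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the listed per-million prices."""
    input_price, output_price = PRICES[provider]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def cheapest_provider(input_tokens: int, output_tokens: int) -> str:
    """Return the provider with the lowest cost for the given token counts."""
    return min(PRICES, key=lambda name: request_cost(name, input_tokens, output_tokens))


if __name__ == "__main__":
    # A request with 2,000 input tokens and 500 output tokens on Groq.
    print(f"{request_cost('Groq', 2_000, 500):.6f}")  # prints 0.000140
    print(cheapest_provider(2_000, 500))
```

This only compares list prices; it ignores rate limits, minimum charges, and any volume discounts a provider may apply.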
[Figure: Price comparison for Llama 3.1 8B Instruct. Price per 1M input tokens (USD); lower is better.]
[Figure: Throughput comparison for Llama 3.1 8B Instruct. Tokens per second; higher is better.]
[Figure: Latency comparison for Llama 3.1 8B Instruct. Time to first token (s); lower is better.]
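
Latency and throughput can be combined into a rough estimate of how long a full response takes: the time to first token plus the output length divided by the generation rate. The sketch below assumes the throughput column is in tokens per second (as the chart caption states, although the table labels it "c/s"), that generation proceeds at a steady rate, and that network and queueing delays are ignored. The `PERF` dictionary and `estimated_seconds` are illustrative names, with values copied from the pricing table.

```python
# (time to first token in seconds, throughput in tokens per second),
# copied from the provider table. Treating throughput as tokens/s is an assumption.
PERF = {
    "Lambda": (0.5, 42.0),
    "DeepInfra": (0.5, 118.0),
    "Groq": (0.5, 750.0),
    "Sambanova": (0.5, 1050.0),
    "Cerebras": (0.2, 2047.0),
    "Hyperbolic": (0.5, 200.0),
    "Together": (0.5, 194.0),
    "Fireworks": (0.5, 292.0),
    "Bedrock": (0.5, 100.0),
}


def estimated_seconds(provider: str, output_tokens: int) -> float:
    """Estimate total response time: first-token latency plus generation time."""
    latency, throughput = PERF[provider]
    return latency + output_tokens / throughput


if __name__ == "__main__":
    # Compare a 750-token response across all listed providers, fastest first.
    for name in sorted(PERF, key=lambda p: estimated_seconds(p, 750)):
        print(f"{name}: {estimated_seconds(name, 750):.2f} s")
```

For short responses the fixed latency dominates, while for long responses throughput matters more, which is why the ranking can shift with output length.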
API Access
API access for Llama 3.1 8B Instruct will be available soon through our gateway.