Llama 3.2 3B Instruct
MetaOverview
Llama 3.2 3B Instruct is a large language model that supports a context length of 128K tokens and are state-of-the-art in their class for on-device use cases like summarization, instruction following, and rewriting tasks running locally at the edge.
Llama 3.2 3B Instruct was released on September 25, 2024. API access is available through DeepInfra.
Performance
Timeline
Other Details
Related Models
Compare Llama 3.2 3B Instruct to other models by quality (GPQA score) vs cost. Higher scores and lower costs represent better value.
Performance visualization loading...
Gathering benchmark data from similar models
Benchmarks
Llama 3.2 3B Instruct Performance Across Datasets
Scores sourced from the model's scorecard, paper, or official blog posts
Pricing
Pricing, performance, and capabilities for Llama 3.2 3B Instruct across different providers:
| Provider | Input ($/M) | Output ($/M) | Max Input | Max Output | Latency (s) | Throughput | Quantization | Input | Output |
|---|---|---|---|---|---|---|---|---|---|
DeepInfra | $0.01 | $0.02 | 128.0K | 128.0K | 0.24 | 171.5 tok/s | — | Text Image Audio Video | Text Image Audio Video |
Example Outputs
Recent Posts
Recent Reviews
API Access
API Access Coming Soon
API access for Llama 3.2 3B Instruct will be available soon through our gateway.
FAQ
Common questions about Llama 3.2 3B Instruct
