Gemini 2.5 Flash-Lite
Overview
Gemini 2.5 Flash-Lite is a model developed by Google DeepMind for reasoning, science, mathematics, code generation, and related tasks, with strong multilingual performance and long-context understanding. It is optimized for low-latency use cases and supports multimodal input with a 1 million-token context window.
Gemini 2.5 Flash-Lite was released on June 17, 2025. API access is available through Google.
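As a rough illustration of that API access, a minimal sketch using Google's `google-genai` Python SDK is shown below; the model identifier `gemini-2.5-flash-lite` and the prompt are assumptions for the example, not taken from this page.

```python
from google import genai

# Assumes an API key for the Gemini API; replace with your own.
client = genai.Client(api_key="YOUR_API_KEY")

# Hypothetical request: the model ID and prompt are illustrative.
response = client.models.generate_content(
    model="gemini-2.5-flash-lite",
    contents="Summarize the key ideas of reinforcement learning in three sentences.",
)
print(response.text)
```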
Benchmarks
[Chart: Gemini 2.5 Flash-Lite performance across benchmark datasets. Scores sourced from the model's scorecard, paper, or official blog posts.]
Pricing
Pricing, performance, and capabilities for Gemini 2.5 Flash-Lite across different providers:
| Provider | Input ($/M) | Output ($/M) | Max Input | Max Output | Latency (s) | Throughput | Quantization | Input Modalities | Output Modalities |
|---|---|---|---|---|---|---|---|---|---|
| Google | $0.10 | $0.40 | 1.0M | 65.5K | 0.44 | 5.69 c/s | — | Text, Image, Audio, Video | Text, Image, Audio, Video |
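To make the per-token rates concrete, here is a minimal sketch that estimates the cost of a single request at the listed Google prices ($0.10 per million input tokens, $0.40 per million output tokens); the token counts in the example are illustrative assumptions.

```python
# Per-million-token prices from the table above (Google provider).
INPUT_PRICE_PER_M = 0.10
OUTPUT_PRICE_PER_M = 0.40

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one request."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_M + \
           (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M

# Example: 10,000 input tokens and 1,000 output tokens.
print(f"${estimate_cost(10_000, 1_000):.6f}")  # prints $0.001400
```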
API Access
API access for Gemini 2.5 Flash-Lite will be available soon through our gateway.
FAQ
Common questions about Gemini 2.5 Flash-Lite