Gemma 3 12B

Google · Mar 2025 · 12B params · Multimodal

Overview

Gemma 3 12B is a 12-billion-parameter vision-language model from Google that accepts text and image input and generates text output. It features a 128K-token context window, multilingual support, and open weights. It is suitable for question answering, summarization, reasoning, and image understanding tasks.

Gemma 3 12B was released on March 12, 2025. API access is available through DeepInfra.

Timeline

Released: March 12, 2025
Knowledge Cutoff: Unknown

Specifications

Parameters: 12.0B
License: Gemma
Training Data: Unknown
Tags: tuning:instruct

Related Models

Compare Gemma 3 12B to other models by quality (GPQA score) vs cost. Higher scores and lower costs represent better value.

Benchmarks

Gemma 3 12B Performance Across Datasets

Scores sourced from the model's scorecard, paper, or official blog posts

Source: llm-stats.com, Sat Dec 20 2025

Pricing

Pricing, performance, and capabilities for Gemma 3 12B across different providers:

Provider: DeepInfra
Input ($/M tokens): $0.05
Output ($/M tokens): $0.10
Max Input: 131.1K tokens
Max Output: 131.1K tokens
Latency (s), Throughput: 0.233.0 tok/s
Quantization: not listed
Input modalities: Text, Image
Output modalities: Text
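The per-token rates above translate into a request cost with simple arithmetic. The following is a minimal sketch, assuming the DeepInfra rates listed on this page ($0.05 per million input tokens, $0.10 per million output tokens) still apply; the function name `estimate_cost_usd` and the example token counts are hypothetical and chosen only for illustration.

```python
# Rates copied from the DeepInfra row on this page, in US dollars per
# one million tokens. Assumption: these rates are current; check the provider.
INPUT_USD_PER_M = 0.05
OUTPUT_USD_PER_M = 0.10


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Hypothetical helper: estimated cost of one request in US dollars."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


# Example: a 100,000-token prompt with a 2,000-token reply.
cost = estimate_cost_usd(100_000, 2_000)
print(f"${cost:.4f}")  # prints "$0.0052"
```

At these rates, even a prompt that fills most of the context window costs well under one cent, so output length and request volume are likely to dominate a monthly bill.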

API Access

API Access Coming Soon

API access for Gemma 3 12B through the llm-stats.com gateway will be available soon. In the meantime, the model can be accessed through DeepInfra.

FAQ

Common questions about Gemma 3 12B

When was Gemma 3 12B released?
Gemma 3 12B was released on March 12, 2025.

How many parameters does Gemma 3 12B have?
Gemma 3 12B has 12.0 billion parameters.