
Gemma 3 4B

Overview


Gemma 3 4B is a 4-billion-parameter vision-language model from Google. It accepts text and image input and generates text output, and features a 128K-token context window, multilingual support, and open weights. It is well suited to question answering, summarization, reasoning, and image-understanding tasks.

Gemma 3 4B was released on March 12, 2025. API access is available through DeepInfra.
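Since DeepInfra exposes an OpenAI-compatible chat-completions API, a request for Gemma 3 4B can be sketched as below. The endpoint URL and model identifier are assumptions for illustration and are not confirmed by this page; check DeepInfra's documentation for the exact values.

```python
import json
from typing import Optional

# Assumed endpoint and model identifier -- verify against DeepInfra's docs.
ENDPOINT = "https://api.deepinfra.com/v1/openai/chat/completions"  # assumption
MODEL_ID = "google/gemma-3-4b-it"  # assumed identifier

def build_request(prompt: str, image_url: Optional[str] = None) -> dict:
    """Build a chat-completions payload; attach an image part if given."""
    content = [{"type": "text", "text": prompt}]
    if image_url:
        # Gemma 3 4B accepts image input alongside text.
        content.append({"type": "image_url", "image_url": {"url": image_url}})
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": content}],
        "max_tokens": 512,
    }

payload = build_request("Describe this picture.", "https://example.com/cat.png")
print(json.dumps(payload, indent=2))
```

Sending `payload` as the JSON body of a POST to the gateway (with an API key header) would return a standard chat-completions response.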

Performance

Timeline

Released: March 12, 2025
Knowledge Cutoff: August 2024

Specifications

Parameters: 4.0B
License: Gemma
Training Data: Unknown
Tags: tuning:instruct

Benchmarks


Gemma 3 4B Performance Across Datasets

Scores sourced from the model's scorecard, paper, or official blog posts

llm-stats.com - Thu Feb 05 2026

Pricing


Pricing, performance, and capabilities for Gemma 3 4B across different providers:

DeepInfra
Input ($/M): $0.02
Output ($/M): $0.04
Max Input: 131.1K
Max Output: 131.1K
Latency (s): 0.2
Throughput: 33.0 c/s
Quantization: Unknown
Input modalities: Text, Image
Output modalities: Text
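At the listed DeepInfra rates ($0.02 per million input tokens, $0.04 per million output tokens), per-request cost is straightforward to estimate:

```python
# Listed DeepInfra rates for Gemma 3 4B, in dollars per million tokens.
INPUT_PER_M = 0.02
OUTPUT_PER_M = 0.04

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the dollar cost of one request at the listed rates."""
    return (input_tokens * INPUT_PER_M + output_tokens * OUTPUT_PER_M) / 1_000_000

# e.g. a 100K-token prompt with a 2K-token reply:
cost = estimate_cost(100_000, 2_000)
print(f"${cost:.5f}")  # $0.00208
```

Even a near-maximum-context request stays well under a cent at these rates.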

API Access

API Access Coming Soon

API access for Gemma 3 4B will be available soon through our gateway.


FAQ

Common questions about Gemma 3 4B

Q: When was Gemma 3 4B released?
A: Gemma 3 4B was released on March 12, 2025, by Google.

Q: Who created Gemma 3 4B?
A: Gemma 3 4B was created by Google.

Q: How many parameters does Gemma 3 4B have?
A: Gemma 3 4B has 4.0 billion parameters.

Q: What license is Gemma 3 4B released under?
A: Gemma 3 4B is released under the Gemma license.

Q: What is the knowledge cutoff of Gemma 3 4B?
A: Gemma 3 4B has a knowledge cutoff of August 2024. The model was trained on data up to this date and may not have information about later events.

Q: Is Gemma 3 4B multimodal?
A: Yes, Gemma 3 4B is a multimodal model that can process both text and images as input.