Gemma 3 4B

Overview

Gemma 3 4B is a 4-billion-parameter vision-language model from Google that accepts text and image input and generates text output. It features a 128K context window, multilingual support, and open weights, and is suited to question answering, summarization, reasoning, and image understanding tasks.

Gemma 3 4B was released on March 12, 2025. API access is available through DeepInfra.

Performance

Timeline

Released: March 12, 2025
Knowledge Cutoff: Unknown

Specifications

Parameters: 4.0B
License: Gemma
Training Data: Unknown
Tags: tuning:instruct

Benchmarks

Gemma 3 4B Performance Across Datasets

Scores sourced from the model's scorecard, paper, or official blog posts

llm-stats.com - Mon Dec 22 2025

Pricing

Pricing, performance, and capabilities for Gemma 3 4B across different providers:

Provider: DeepInfra
Input price: $0.02 per million tokens
Output price: $0.04 per million tokens
Max input: 131.1K tokens
Max output: 131.1K tokens
Latency: 0.2 s
Throughput: 33.0 tok/s
Quantization: not listed
Input modalities: Text, Image
Output modalities: Text
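To make the per-token prices concrete, here is a minimal sketch of a cost estimate for a single request, using the DeepInfra prices and limits listed above. The names `Pricing`, `DEEPINFRA_GEMMA_3_4B`, and `estimate_cost` are hypothetical and not part of any provider SDK, and the exact limit of 131,072 tokens (128 x 1024) is an assumption standing in for the rounded "131.1K" figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Per-provider pricing and token limits (hypothetical helper type)."""
    input_per_million: float   # USD per 1M input tokens
    output_per_million: float  # USD per 1M output tokens
    max_input_tokens: int
    max_output_tokens: int


# Prices come from the listing above. The limit of 131_072 tokens is an
# assumption: 128 * 1024, which rounds to the listed "131.1K".
DEEPINFRA_GEMMA_3_4B = Pricing(
    input_per_million=0.02,
    output_per_million=0.04,
    max_input_tokens=128 * 1024,
    max_output_tokens=128 * 1024,
)


def estimate_cost(pricing: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request.

    Raises ValueError if a token count is negative or exceeds the limits.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if input_tokens > pricing.max_input_tokens:
        raise ValueError("input exceeds the provider's max input")
    if output_tokens > pricing.max_output_tokens:
        raise ValueError("output exceeds the provider's max output")
    total = (
        input_tokens * pricing.input_per_million
        + output_tokens * pricing.output_per_million
    )
    return total / 1_000_000


if __name__ == "__main__":
    cost = estimate_cost(DEEPINFRA_GEMMA_3_4B, 100_000, 10_000)
    print(f"Estimated cost: ${cost:.4f}")
```

Under these assumptions, a request with 100,000 input tokens and 10,000 output tokens costs roughly a quarter of a cent. Actual billing may differ, for example in how a provider counts image inputs.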

API Access

API Access Coming Soon

API access for Gemma 3 4B will be available soon through our gateway.

FAQ

Common questions about Gemma 3 4B

When was Gemma 3 4B released?
Gemma 3 4B was released on March 12, 2025.

How many parameters does Gemma 3 4B have?
Gemma 3 4B has 4.0 billion parameters.