Gemma 3 4B pricing
Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles context windows up to 128k tokens, understands over 140 languages, and offers improved math, reasoning, and chat capabilities, including structured outputs and function calling. The table below lists 2 current offers with input and output prices in dollars per million tokens. The lowest input price is currently $0.040 and the lowest output price is $0.080.
Pricing across providers
All figures are list prices per million tokens unless a column says otherwise. 2 offers are listed for Gemma 3 4B. Best input in this view: OpenRouter (tied with DeepInfra at $0.040).
| Provider | Input / 1M | Output / 1M | Cached input | Batch |
|---|---|---|---|---|
| OpenRouter | $0.040 | $0.080 | — | — |
| DeepInfra | $0.040 | $0.080 | — | — |
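
As a rough sketch of how the "best input" pick could be reproduced from the rows above, the snippet below finds the lowest price on one side and reports every provider that matches it, so ties are visible. The `OFFERS` dictionary and the `cheapest` helper are illustrative names, not part of any provider API, and the prices are copied by hand from the table.

```python
from decimal import Decimal

# List prices in USD per 1M tokens, copied by hand from the table above.
# OFFERS and cheapest() are hypothetical names used only for this sketch.
OFFERS = {
    "OpenRouter": {"input": Decimal("0.040"), "output": Decimal("0.080")},
    "DeepInfra": {"input": Decimal("0.040"), "output": Decimal("0.080")},
}


def cheapest(offers, side):
    """Return (lowest price, sorted provider names at that price).

    `side` is either "input" or "output". All providers that share the
    lowest price are returned, so a tie is never hidden.
    """
    lowest = min(offer[side] for offer in offers.values())
    names = sorted(name for name, offer in offers.items() if offer[side] == lowest)
    return lowest, names


if __name__ == "__main__":
    for side in ("input", "output"):
        price, names = cheapest(OFFERS, side)
        print(f"Lowest {side}: ${price} per 1M tokens from {', '.join(names)}")
```

With the two rows listed here, both providers are returned for both sides, since their list prices are identical.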
[Chart: input vs output price, per provider]
Cost calculator
Use this block to stress test Gemma 3 4B cost without a spreadsheet. All estimates come from public list rates in this page.
| Provider | Estimated cost |
|---|---|
| OpenRouter | 0.004000¢ / req |
| DeepInfra | 0.004000¢ / req |
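
For readers who want to run the same arithmetic offline, here is a minimal sketch of a per-request and monthly estimate based on the list rates on this page ($0.040 input and $0.080 output per million tokens). The function names, the token counts, and the 30-day month are assumptions made for illustration; they are not the calculator's actual settings.

```python
from decimal import Decimal

# USD per 1M tokens, taken from the list prices on this page.
INPUT_PER_M = Decimal("0.040")
OUTPUT_PER_M = Decimal("0.080")
TOKENS_PER_M = Decimal(1_000_000)


def request_cost_usd(input_tokens, output_tokens,
                     input_per_m=INPUT_PER_M, output_per_m=OUTPUT_PER_M):
    """Cost of one request in USD at per-million-token list rates."""
    total = input_tokens * input_per_m + output_tokens * output_per_m
    return total / TOKENS_PER_M


def monthly_cost_usd(input_tokens, output_tokens, requests_per_day, days=30):
    """Projected cost in USD, assuming identical requests every day.

    The 30-day default is an assumption for this sketch.
    """
    return request_cost_usd(input_tokens, output_tokens) * requests_per_day * days


if __name__ == "__main__":
    # Hypothetical workload: 500 input tokens and 250 output tokens per request.
    per_request = request_cost_usd(500, 250)
    print(f"Per request: ${per_request:.6f} ({per_request * 100:.6f} cents)")
    print(f"1,000 requests/day for 30 days: ${monthly_cost_usd(500, 250, 1000):.2f}")
```

Under these assumed token counts, one request costs $0.00004 (0.004¢), and 1,000 such requests a day come to $1.20 over 30 days. `Decimal` is used instead of floats so that small per-request amounts do not pick up rounding noise.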
Model specifications
These fields describe Gemma 3 4B as we store it (source: Google). They sit next to price so buyers can check limits and tools in one place.
- Context window: 131,072 tokens
- Max output: 131,072 tokens
- Vision (images): Yes
- Tool / function calling: Yes
- Streaming: No
- Released: Mar 2025
- Primary provider
- Model family: N/A
Compare Gemma 3 4B
Jump into a comparison when you want one table for two models instead of two tabs. There are 6 curated matches for Gemma 3 4B, each comparing pricing side by side.
- Gemma 3 4B vs Gemini 2.0 Flash
- Gemma 3 4B vs Gemini 1.5 Pro
- Gemma 3 4B vs Gemini 1.5 Flash
- Gemma 3 4B vs GPT-4o
- Gemma 3 4B vs GPT-4o mini
- Gemma 3 4B vs Claude Sonnet 4.6
Frequently asked questions
Quick answers to common questions about Gemma 3 4B pricing and limits.
Also from Google
Other models by Google with live pricing in our catalog.