GPT-4.1 Nano pricing
For tasks that demand low latency, GPT‑4.1 nano is the fastest and cheapest model in the GPT‑4.1 series. It delivers exceptional performance at a small size with its 1 million token context window, scoring 80.1% on MMLU, 50.3% on GPQA, and 9.8% on Aider polyglot coding, higher than GPT‑4o mini. It's ideal for tasks like classification and autocompletion. Live index: 5 priced offers. Best input price: $0.100 per million tokens, from OpenRouter. Best output price: $0.400 per million tokens, from OpenRouter.
Pricing across providers
Use this table to compare GPT-4.1 Nano list prices; we currently show 5 sources. The lowest input price in the grid comes from OpenRouter. The chart below the table is useful when output prices run much higher than input prices.
| Provider | Input / 1M | Output / 1M | Cached input | Batch |
|---|---|---|---|---|
| OpenRouter | $0.100 | $0.400 | $0.025 | — |
| Vercel AI Gateway | $0.100 | $0.400 | $0.025 | — |
| Azure | $0.100 | $0.400 | $0.025 | — |
| OpenAI (native) | $0.100 | $0.400 | $0.025 | — |
| Replicate | $0.100 | $0.400 | — | — |
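The table's per-million-token prices translate into per-request costs with simple arithmetic. A minimal sketch, using the list prices above; the split between fresh and cached input tokens is an illustrative assumption:

```python
# GPT-4.1 Nano list prices from the table above (USD per 1M tokens).
INPUT_PER_M = 0.100
CACHED_INPUT_PER_M = 0.025
OUTPUT_PER_M = 0.400

def request_cost(fresh_in: int, cached_in: int, out: int) -> float:
    """Cost in USD for one request with the given token counts."""
    return (fresh_in * INPUT_PER_M
            + cached_in * CACHED_INPUT_PER_M
            + out * OUTPUT_PER_M) / 1_000_000

# Example: 2,000 fresh input + 8,000 cached input + 500 output tokens.
cost = request_cost(2_000, 8_000, 500)
print(f"${cost:.6f}")  # $0.000600
```

Note how the cached-input rate ($0.025 vs $0.100) is a 75% discount, so prompt-heavy workloads with repeated prefixes get markedly cheaper.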
Chart: input vs output price, per provider.
Cost calculator
Pick any of the providers above and enter how many tokens you expect per day, week, or year; we turn that into rough dollar totals for GPT-4.1 Nano.
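The calculator's arithmetic can be sketched directly: scale a daily token volume by the list prices, then extrapolate to a week and a year. Prices are the GPT-4.1 Nano rates shown above; the token volumes are placeholders.

```python
# GPT-4.1 Nano list prices (USD per 1M tokens).
INPUT_PER_M = 0.100
OUTPUT_PER_M = 0.400

def projected_cost(in_tokens_per_day: int, out_tokens_per_day: int) -> dict:
    """Rough USD totals per day, week (x7), and year (x365)."""
    daily = (in_tokens_per_day * INPUT_PER_M
             + out_tokens_per_day * OUTPUT_PER_M) / 1_000_000
    return {"day": daily, "week": daily * 7, "year": daily * 365}

# Example: 10M input + 2M output tokens per day.
totals = projected_cost(10_000_000, 2_000_000)
# daily = 10 * 0.100 + 2 * 0.400 = $1.80
```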
Model specifications
Context length, caps, and capability flags for GPT-4.1 Nano. Family: GPT-4. Values follow the primary provider (OpenAI) record in our index.
- Context window: 1,047,576 tokens
- Max output: 32,768 tokens
- Vision (images): Yes
- Tool / function calling: Yes
- Streaming: No
- Released: Apr 2025
- Primary provider: OpenAI
- Model family: GPT-4
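The context-window and max-output caps above constrain any request you plan. A minimal guard, using the spec values from this page; the token counts are illustrative (real counts come from a tokenizer):

```python
# Spec values for GPT-4.1 Nano from the list above.
CONTEXT_WINDOW = 1_047_576  # total tokens (prompt + completion)
MAX_OUTPUT = 32_768         # completion cap

def fits(prompt_tokens: int, max_new_tokens: int) -> bool:
    """True if the request respects both the output cap and the window."""
    return (max_new_tokens <= MAX_OUTPUT
            and prompt_tokens + max_new_tokens <= CONTEXT_WINDOW)

print(fits(1_000_000, 32_768))  # True:  1,032,768 <= 1,047,576
print(fits(1_040_000, 32_768))  # False: 1,072,768 exceeds the window
```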
Compare GPT-4.1 Nano
Open a pair page to see GPT-4.1 Nano next to another model with a shared provider matrix. 6 shortcuts below.
- GPT-4.1 Nano vs GPT-4o: GPT-4.1 Nano is 96% cheaper on output
- GPT-4.1 Nano vs GPT-4o mini: GPT-4.1 Nano is 33% cheaper on output
- GPT-4.1 Nano vs o3: GPT-4.1 Nano is 95% cheaper on output
- GPT-4.1 Nano vs Claude Sonnet 4.6: GPT-4.1 Nano is 97% cheaper on output
- GPT-4.1 Nano vs Gemini 2.0 Flash: same output pricing
- GPT-4.1 Nano vs Llama 3.1 70B: compare pricing side by side
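The "X% cheaper on output" figures above are relative savings of GPT-4.1 Nano's $0.400/1M output price against the other model's output price. A sketch of the derivation; the comparison prices used in the example are assumptions for illustration:

```python
NANO_OUTPUT = 0.400  # GPT-4.1 Nano output price, USD per 1M tokens

def pct_cheaper(other_output_per_m: float) -> int:
    """Rounded percent savings vs another model's output price."""
    return round((1 - NANO_OUTPUT / other_output_per_m) * 100)

# With an assumed $10.00/1M comparison price: 1 - 0.4/10 = 96% cheaper.
print(pct_cheaper(10.00))  # 96
# With an assumed $0.60/1M comparison price: 1 - 0.4/0.6 ≈ 33% cheaper.
print(pct_cheaper(0.60))   # 33
```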
Frequently asked questions
Answers pull from the same numbers you see on this page. The short model note from our index is the summary repeated at the top of this page.
Also from OpenAI
Other models by OpenAI with live pricing in our catalog.