Llama V3p1 405b pricing
If you are budgeting for Llama V3p1 405b (Meta's Llama 3.1 405B), start with the numbers below. We index one provider price, from Fireworks AI: $3.00 per million input tokens and $3.00 per million output tokens. Our data lists a 128K-token context window for the model.
Pricing across providers
Use this table to read list prices for Llama V3p1 405b. We show one source right now, Fireworks AI, so it is also the lowest input price in the grid. The chart below the table is most useful when output prices are much higher than input prices; for this model the two are equal.
| Provider | Input / 1M | Output / 1M | Cached input | Batch |
|---|---|---|---|---|
| Fireworks AI | $3.00 | $3.00 | — | — |
Chart: input vs output price per 1M tokens
Cost calculator
Pick any provider row and type how many tokens you expect per day, week, or year. We turn that into rough dollar totals for Llama V3p1 405b.
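The arithmetic behind an estimate like this is simple enough to check by hand. Below is a minimal sketch, not the calculator's actual code: it assumes the $3.00 per million list prices from the table, and the function names and the example workload (1,000 input tokens, 500 output tokens, 10,000 requests per day) are hypothetical.

```python
from decimal import Decimal

# List prices from the pricing table (USD per 1M tokens, Fireworks AI row).
INPUT_PRICE_PER_M = Decimal("3.00")
OUTPUT_PRICE_PER_M = Decimal("3.00")
TOKENS_PER_MILLION = Decimal(1_000_000)


def request_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Cost of a single request in USD at the list prices above."""
    input_cost = Decimal(input_tokens) / TOKENS_PER_MILLION * INPUT_PRICE_PER_M
    output_cost = Decimal(output_tokens) / TOKENS_PER_MILLION * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


def projected_cost_usd(
    input_tokens: int, output_tokens: int, requests_per_day: int, days: int
) -> Decimal:
    """Rough total for a steady workload over a number of days."""
    return request_cost_usd(input_tokens, output_tokens) * requests_per_day * days


if __name__ == "__main__":
    # Hypothetical workload, chosen only for illustration.
    per_request = request_cost_usd(1_000, 500)
    print(f"per request: ${per_request:.4f}")
    print(f"per day:     ${projected_cost_usd(1_000, 500, 10_000, 1):.2f}")
    print(f"per 30 days: ${projected_cost_usd(1_000, 500, 10_000, 30):.2f}")
```

With that example workload, a request costs $0.0045 (0.3¢ of input plus 0.15¢ of output), which comes to $45 per day and $1,350 over 30 days. Treat these as list-price estimates; the sketch ignores cached-input and batch rates, which the table does not list for this provider.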
Model specifications
Quick spec sheet for Llama V3p1 405b before you dive back into pricing. The model is listed under Meta.
- Context window: 128,000 tokens
- Max output: 16,384 tokens
- Vision (images): No
- Tool / function calling: Yes
- Streaming: No
- Released: N/A
- Primary provider: Meta
- Model family: N/A
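The context window and max output limits also put a ceiling on what one request can cost. The sketch below works that out under one assumption that our data does not confirm: that the 128,000-token window is shared by the prompt and the completion. The function names are hypothetical, and the prices are the $3.00 per million list prices from the pricing table.

```python
# Limits from the spec sheet.
CONTEXT_WINDOW_TOKENS = 128_000
MAX_OUTPUT_TOKENS = 16_384

# List prices from the pricing table (USD per 1M tokens).
INPUT_PRICE_PER_M = 3.00
OUTPUT_PRICE_PER_M = 3.00


def max_prompt_tokens(reserved_output_tokens: int = MAX_OUTPUT_TOKENS) -> int:
    """Largest prompt that still leaves room for the reserved output.

    Assumption: prompt and completion share the same context window.
    """
    if not 0 <= reserved_output_tokens <= MAX_OUTPUT_TOKENS:
        raise ValueError("reserved output must be between 0 and the max output limit")
    return CONTEXT_WINDOW_TOKENS - reserved_output_tokens


def worst_case_request_cost_usd() -> float:
    """Upper bound for one request: a full window with the max output used."""
    prompt = max_prompt_tokens(MAX_OUTPUT_TOKENS)
    total = prompt * INPUT_PRICE_PER_M + MAX_OUTPUT_TOKENS * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    print(f"max prompt with full output reserved: {max_prompt_tokens():,} tokens")
    print(f"worst-case cost per request: ${worst_case_request_cost_usd():.3f}")
```

Under that assumption, reserving the full 16,384 output tokens leaves 111,616 tokens for the prompt, and a request that fills the whole window costs about $0.384 at list price.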
Compare Llama V3p1 405b
Open a pair page to see Llama V3p1 405b next to another model in a shared provider matrix. Six shortcuts are listed below.
Compare pricing side by side:
- Llama V3p1 405b vs Llama 3.1 70B
- Llama V3p1 405b vs Llama 3.1 8B
- Llama V3p1 405b vs GPT-4o
- Llama V3p1 405b vs GPT-4o mini
- Llama V3p1 405b vs Claude Sonnet 4.6
- Llama V3p1 405b vs Gemini 2.0 Flash
Frequently asked questions
Read these after the table if you want plain-language answers about Llama V3p1 405b rates.
Also from Meta
Other models by Meta with live pricing in our catalog.