Qwen3 8B pricing
Qwen3-8B is a dense 8.2B-parameter causal language model from the Qwen3 series, designed for both reasoning-heavy tasks and efficient dialogue. It switches between a "thinking" mode for math, coding, and logical inference and a "non-thinking" mode for general conversation. The model is fine-tuned for instruction following, agent integration, creative writing, and multilingual use across 100+ languages and dialects. It natively supports a 32K-token context window, which can be extended. The table below lists 3 current providers with input and output prices in dollars per million tokens. Right now the lowest input price is $0.040 and the lowest output price is $0.140 per million tokens.
Pricing across providers
Each row is a provider that serves Qwen3 8B and whose token pricing we track. The cheapest input price in this snapshot is from Llamagate, which also has the cheapest output price. The bar chart shows the same input and output prices in dollars per million tokens for a quick scan.
| Provider | Input / 1M | Output / 1M | Cached input | Batch |
|---|---|---|---|---|
| Openrouter | $0.050 | $0.400 | — | — |
| Llamagate | $0.040 | $0.140 | — | — |
| Fireworks AI | $0.200 | $0.200 | — | — |
Input vs output · per provider
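As a minimal sketch of how the "lowest input" and "lowest output" figures can be read off the table, the snippet below copies the three rows into a dictionary and picks the minimum for each price type. The `PRICES` dictionary and the `cheapest` helper are illustrative names, not part of any provider API, and the numbers are only this snapshot's values.

```python
# Prices copied from the provider table, in USD per 1M tokens.
PRICES = {
    "Openrouter": {"input": 0.050, "output": 0.400},
    "Llamagate": {"input": 0.040, "output": 0.140},
    "Fireworks AI": {"input": 0.200, "output": 0.200},
}


def cheapest(prices, kind):
    """Return (provider, price) with the lowest price for 'input' or 'output'."""
    name = min(prices, key=lambda provider: prices[provider][kind])
    return name, prices[name][kind]


print(cheapest(PRICES, "input"))   # ('Llamagate', 0.04)
print(cheapest(PRICES, "output"))  # ('Llamagate', 0.14)
```

The cheapest input and cheapest output happen to come from the same provider here; with other snapshots they may differ, so a real comparison would weight both by the expected token mix.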
Cost calculator
The calculator uses the same dollars per million tokens as the table. Adjust sliders to see how Qwen3 8B cost scales with traffic.
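The arithmetic behind a calculator like this is a straightforward scaling of the per-million prices. The sketch below assumes a hypothetical workload of 1,000 input tokens and 500 output tokens per request at 10,000 requests per day, priced at the Llamagate row ($0.040 input, $0.140 output per 1M tokens). The function name and traffic figures are illustrative assumptions, not values from the page.

```python
def request_cost_usd(input_tokens, output_tokens, input_per_million, output_per_million):
    """Cost of one request in USD, given prices in USD per 1M tokens."""
    total = input_tokens * input_per_million + output_tokens * output_per_million
    return total / 1_000_000


# Hypothetical traffic: 1,000 input and 500 output tokens per request,
# 10,000 requests per day, at the Llamagate prices from the table.
per_request = request_cost_usd(1_000, 500, 0.040, 0.140)
per_day = per_request * 10_000

print(f"{per_request * 100:.6f} cents per request")  # 0.011000 cents per request
print(f"${per_day:.2f} per day")                     # $1.10 per day
```

Because the cost is linear in both token counts, doubling the traffic or the tokens per request doubles the bill. Output tokens dominate whenever the output price is much higher than the input price.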
Model specifications
Context length, caps, and capability flags for Qwen3 8B. Family: Qwen. Values follow the main provider (Alibaba) record in our index.
- Context window: 40,960 tokens
- Max output: 40,960 tokens
- Vision (images): No
- Tool / function calling: Yes
- Streaming: No
- Released: Apr 2025
- Primary provider: Alibaba
- Model family: Qwen
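One practical use of the limits listed above is clamping a requested completion length before sending a request. The sketch below assumes that prompt and output tokens share the 40,960-token context window, which the page does not state explicitly, and `max_completion_tokens` is a hypothetical helper name.

```python
CONTEXT_WINDOW = 40_960  # tokens, from the specifications listed above
MAX_OUTPUT = 40_960      # tokens, from the specifications listed above


def max_completion_tokens(prompt_tokens, requested_output):
    """Clamp a requested output length to the listed limits.

    Assumption: prompt and output tokens count against the same window.
    """
    if prompt_tokens >= CONTEXT_WINDOW:
        raise ValueError("prompt alone fills or exceeds the context window")
    remaining = CONTEXT_WINDOW - prompt_tokens
    return min(requested_output, MAX_OUTPUT, remaining)


print(max_completion_tokens(38_000, 4_096))  # 2960
```

Individual providers may enforce lower caps than the ones in this record, so the constants are best treated as upper bounds.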
Compare Qwen3 8B
Open a pair page to see Qwen3 8B next to another model with a shared provider matrix. 6 shortcuts below.
- Qwen3 8B vs GPT-4o
- Qwen3 8B vs GPT-4o mini
- Qwen3 8B vs Claude Sonnet 4.6
- Qwen3 8B vs Gemini 2.0 Flash
- Qwen3 8B vs o3
- Qwen3 8B vs Llama 3.1 70B
Frequently asked questions
Quick answers to common questions about Qwen3 8B pricing and limits.
Also from Alibaba
Other models by Alibaba with live pricing in our catalog.