Qwen3 32B pricing
Qwen3-32B is a dense 32.8B parameter causal language model from the Qwen3 series, optimized for both complex reasoning and efficient dialogue. It supports seamless switching between a "thinking" mode for tasks like math, coding, and logical inference, and a "non-thinking" mode for faster, general-purpose conversation. The model demonstrates strong performance in instruction-following, agent tool use, creative writing, and multilingual tasks across 100+ languages and dialects. The table below lists 7 current offers with input and output prices in dollars per million tokens. The lowest input price is currently $0.080 and the lowest output price is $0.230.
Pricing across providers
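All figures are list prices in US dollars per million tokens unless a column says otherwise. 7 offers are listed for Qwen3 32B. The lowest input price in this view is $0.080, offered by both OVHcloud and OpenRouter; OVHcloud also has the lowest output price at $0.230.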
| Provider | Input / 1M | Output / 1M | Cached input | Batch |
|---|---|---|---|---|
| Nebius | $0.100 | $0.300 | — | — |
| OVHcloud | $0.080 | $0.230 | — | — |
| SambaNova | $0.400 | $0.800 | — | — |
| OpenRouter | $0.080 | $0.240 | — | — |
| Groq | $0.290 | $0.590 | — | — |
| Fireworks AI | $0.900 | $0.900 | — | — |
| DeepInfra | $0.100 | $0.280 | — | — |
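Because input and output are priced separately, the cheapest provider depends on your traffic mix. The sketch below ranks the seven rows by a blended price. The prices are copied from the table; the `blended_price` and `rank_providers` helpers and the 75% input share are illustrative assumptions, not part of this page's data.

```python
# List prices in USD per 1M tokens, copied from the table: (input, output).
PRICES = {
    "Nebius": (0.100, 0.300),
    "OVHcloud": (0.080, 0.230),
    "SambaNova": (0.400, 0.800),
    "OpenRouter": (0.080, 0.240),
    "Groq": (0.290, 0.590),
    "Fireworks AI": (0.900, 0.900),
    "DeepInfra": (0.100, 0.280),
}


def blended_price(input_price, output_price, input_share=0.75):
    """Weighted average price per 1M tokens.

    input_share is the assumed fraction of tokens that are input tokens.
    """
    return input_share * input_price + (1.0 - input_share) * output_price


def rank_providers(prices, input_share=0.75):
    """Return provider names sorted from cheapest to most expensive blend."""
    return sorted(
        prices,
        key=lambda name: blended_price(*prices[name], input_share=input_share),
    )


lowest_input = min(inp for inp, _ in PRICES.values())
lowest_output = min(out for _, out in PRICES.values())

for name in rank_providers(PRICES):
    inp, out = PRICES[name]
    print(f"{name:<13} ${blended_price(inp, out):.4f} per 1M blended tokens")
```

Changing `input_share` toward 0 weights the ranking toward output price, which matters for long "thinking" mode responses.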
Input vs output · per provider
Cost calculator
Use this block to stress-test Qwen3 32B cost without a spreadsheet. All estimates come from the public list rates on this page.
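The same estimate can be reproduced offline. The following is a minimal sketch of the per-request arithmetic, assuming cost is simply tokens multiplied by the list rate per million; the function names and the example workload (1,000 input tokens, 500 output tokens, 10,000 requests per day) are hypothetical, and the rates are the Nebius row from the pricing table.

```python
TOKENS_PER_MILLION = 1_000_000


def request_cost_usd(input_tokens, output_tokens, input_price, output_price):
    """Cost of one request in USD, given list prices in USD per 1M tokens."""
    return (
        input_tokens * input_price + output_tokens * output_price
    ) / TOKENS_PER_MILLION


def monthly_cost_usd(requests_per_day, input_tokens, output_tokens,
                     input_price, output_price, days=30):
    """Cost of a steady workload over `days` days, in USD."""
    per_request = request_cost_usd(
        input_tokens, output_tokens, input_price, output_price
    )
    return requests_per_day * days * per_request


# Hypothetical workload priced at the Nebius list rates ($0.100 in, $0.300 out).
per_request = request_cost_usd(1_000, 500, input_price=0.100, output_price=0.300)
monthly = monthly_cost_usd(10_000, 1_000, 500, input_price=0.100, output_price=0.300)

print(f"{per_request * 100:.6f} cents per request")
print(f"${monthly:.2f} per 30-day month")
```

Cached-input and batch discounts are not modeled here, since no provider on this page lists them for Qwen3 32B.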
Model specifications
These fields describe Qwen3 32B as we store it (family: Qwen; source: Alibaba). They sit next to the prices so buyers can check limits and tool support in one place.
- Context window: 40,960 tokens
- Max output: 40,960 tokens
- Vision (images): No
- Tool / function calling: Yes
- Streaming: No
- Released: Apr 2025
- Primary provider: Alibaba
- Model family: Qwen
Compare Qwen3 32B
Open a pair page to see Qwen3 32B next to another model with a shared provider matrix. 6 shortcuts below.
- Qwen3 32B vs GPT-4o
- Qwen3 32B vs GPT-4o mini
- Qwen3 32B vs Claude Sonnet 4.6
- Qwen3 32B vs Gemini 2.0 Flash
- Qwen3 32B vs o3
- Qwen3 32B vs Llama 3.1 70B
Frequently asked questions
Answers pull from the same numbers you see on this page.
Also from Alibaba
Other models by Alibaba with live pricing in our catalog.