Qwen3 8B pricing

Qwen3-8B is a dense 8.2B parameter causal language model from the Qwen3 series, designed for both reasoning-heavy tasks and efficient dialogue. It supports seamless switching between "thinking" mode for math, coding, and logical inference, and "non-thinking" mode for general conversation. The model is fine-tuned for instruction-following, agent integration, creative writing, and multilingual use across 100+ languages and dialects. It natively supports a 32K token context window and can extend to. The listing below covers 3 providers, each with input and output prices in dollars per million tokens. The lowest input price is currently $0.040 and the lowest output price is $0.140.

41K context · 3 providers · verified Apr 7, 2026
Best input: $0.040 per 1M tokens · Llamagate
Best output: $0.140 per 1M tokens · Llamagate

Pricing across providers

Each row is a provider of Qwen3 8B whose token pricing we track. The cheapest input price in this snapshot is from Llamagate. The bar chart shows the same input and output prices in dollars per million tokens for a quick scan.

Provider        Input / 1M    Output / 1M
Openrouter      $0.050        $0.400
Llamagate       $0.040        $0.140
Fireworks AI    $0.200        $0.200
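
Which provider is cheapest overall depends on how a workload splits between input and output tokens. The sketch below is a minimal illustration, not part of the page's own tooling: it computes a blended price per 1M tokens from the three rows above and ranks the providers. The names `PRICES`, `blended_price`, `rank_providers`, and the `input_share` parameter are hypothetical and introduced only for this example.

```python
# Prices in USD per 1M tokens, copied from the provider rows above.
PRICES = {
    "Openrouter": (0.050, 0.400),
    "Llamagate": (0.040, 0.140),
    "Fireworks AI": (0.200, 0.200),
}


def blended_price(input_price: float, output_price: float, input_share: float) -> float:
    """Weighted price per 1M tokens when `input_share` of all tokens are input."""
    if not 0.0 <= input_share <= 1.0:
        raise ValueError("input_share must be between 0 and 1")
    return input_price * input_share + output_price * (1.0 - input_share)


def rank_providers(input_share: float) -> list[str]:
    """Provider names sorted from cheapest to most expensive blended price."""
    return sorted(PRICES, key=lambda name: blended_price(*PRICES[name], input_share))


if __name__ == "__main__":
    for share in (0.75, 0.25):
        print(f"input share {share:.0%}:")
        for name in rank_providers(share):
            price = blended_price(*PRICES[name], share)
            print(f"  {name}: ${price:.4f} per 1M tokens")
```

With these rows, Llamagate ranks first at any mix because it has the lowest input and output price. The order of Openrouter and Fireworks AI depends on the mix: Openrouter is cheaper for input-heavy traffic, Fireworks AI for output-heavy traffic.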

Input vs output · per provider

Cost calculator

The calculator uses the same dollars per million tokens as the table. Adjust sliders to see how Qwen3 8B cost scales with traffic.

Provider rates: In: $0.050/M · Out: $0.400/M
Input: 0.005000¢ / req
Output: 0.020000¢ / req

Daily: $2.50
Monthly: $75
Annual: $913
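
The arithmetic behind these figures can be sketched as follows. The page does not state the slider settings, so the values used here (1,000 input tokens and 500 output tokens per request, 10,000 requests per day, a 30-day month, and a 365-day year) are assumptions back-calculated from the displayed per-request and daily amounts. The function name `request_cost_usd` is hypothetical.

```python
def request_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_price_per_m: float,
    output_price_per_m: float,
) -> float:
    """Cost of one request in USD, given prices in USD per 1M tokens."""
    return (
        input_tokens * input_price_per_m + output_tokens * output_price_per_m
    ) / 1_000_000


# Assumed slider values (not stated on the page; inferred from the figures shown).
INPUT_TOKENS = 1_000
OUTPUT_TOKENS = 500
REQUESTS_PER_DAY = 10_000

# Rates shown in the calculator: In $0.050/M, Out $0.400/M.
per_request = request_cost_usd(INPUT_TOKENS, OUTPUT_TOKENS, 0.050, 0.400)
daily = per_request * REQUESTS_PER_DAY
monthly = daily * 30   # assumes a 30-day month
annual = daily * 365   # assumes a 365-day year

if __name__ == "__main__":
    print(f"per request: {per_request * 100:.6f} cents")
    print(f"daily:   ${daily:.2f}")
    print(f"monthly: ${monthly:.2f}")
    print(f"annual:  ${annual:.2f}")
```

Under these assumptions the per-request cost is 0.025¢ (0.005¢ input plus 0.02¢ output), and the annual total comes to about $912.50, which matches the $913 shown once rounded to whole dollars.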

Model specifications

Context length, caps, and capability flags for Qwen3 8B. Family: Qwen. Values follow the main provider (Alibaba) record in our index.

Context window: 40,960 tokens
Max output: 40,960 tokens
Vision (images): No
Tool / function calling: Yes
Streaming: No
Released: Apr 2025
Primary provider: Alibaba
Model family: Qwen

Frequently asked questions

Quick answers to common questions about Qwen3 8B pricing and limits.

Yes. Qwen3 8B is available on Openrouter, Llamagate, and Fireworks AI.