Qwen3.5-Flash pricing
The Qwen3.5 native vision-language Flash models are built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency. Compared to the Qwen3 series, these models deliver a leap forward in performance for both pure text and multimodal tasks, offering fast response times while balancing inference speed and overall performance.
Live index: 1 priced offer. Best input: $0.100 per million tokens from Openrouter. Best output: $0.400 per million tokens from Openrouter.
Pricing across providers
Every row is a seller of Qwen3.5-Flash with token pricing we track. The cheapest input in this snapshot is from Openrouter. The bar chart shows the same input and output dollars per million for a quick scan.
| Provider | Input / 1M | Output / 1M | Cached input | Batch |
|---|---|---|---|---|
| Openrouter | $0.100 | $0.400 | — | — |
Input vs output · 1M tokens
Cost calculator
Pick any provider row and type how many tokens you expect per day, week, or year. We turn that into rough dollar totals for Qwen3.5-Flash.
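The arithmetic behind such a calculator can be sketched in a few lines of Python. This is a minimal illustration, not the page's own implementation: the names `estimate_cost` and `CostEstimate` are hypothetical, the prices are the Openrouter figures from the table above, and it assumes a 7-day week, a 365-day year, and no cached-input or batch discounts.

```python
from dataclasses import dataclass

# Openrouter prices from the table above, in US dollars per one million tokens.
INPUT_USD_PER_MILLION = 0.100
OUTPUT_USD_PER_MILLION = 0.400


@dataclass
class CostEstimate:
    """Rough dollar totals (hypothetical helper type, not part of any API)."""
    per_day: float
    per_week: float
    per_year: float


def estimate_cost(input_tokens_per_day: int, output_tokens_per_day: int) -> CostEstimate:
    """Turn expected daily token volumes into rough dollar totals.

    Assumes flat per-token pricing with no caching or batch discounts.
    """
    daily = (
        input_tokens_per_day * INPUT_USD_PER_MILLION
        + output_tokens_per_day * OUTPUT_USD_PER_MILLION
    ) / 1_000_000
    # Assumption: 7 days per week and 365 days per year.
    return CostEstimate(per_day=daily, per_week=daily * 7, per_year=daily * 365)


if __name__ == "__main__":
    # Example workload: 2M input tokens and 500k output tokens per day.
    estimate = estimate_cost(2_000_000, 500_000)
    print(f"day:  ${estimate.per_day:.2f}")
    print(f"week: ${estimate.per_week:.2f}")
    print(f"year: ${estimate.per_year:.2f}")
```

For the example workload, input and output each contribute $0.20 per day, so the sketch reports about $0.40 per day, $2.80 per week, and $146.00 per year. Output tokens cost four times as much as input tokens here, so output-heavy workloads dominate the total quickly.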
Model specifications
Quick spec sheet for Qwen3.5-Flash before you dive back into pricing. Family: Qwen. Reported under Alibaba.
- Context window: 1,000,000 tokens
- Max output: 65,536 tokens
- Vision (images): Yes
- Tool / function calling: Yes
- Streaming: No
- Released: Feb 2026
- Primary provider: Alibaba
- Model family: Qwen
Compare Qwen3.5-Flash
These links open full side-by-side comparison pages for Qwen3.5-Flash. We picked pairs that people often shop together; 6 comparisons are available.
- Qwen3.5-Flash vs GPT-4o
- Qwen3.5-Flash vs GPT-4o mini
- Qwen3.5-Flash vs Claude Sonnet 4.6
- Qwen3.5-Flash vs Gemini 2.0 Flash
- Qwen3.5-Flash vs o3
- Qwen3.5-Flash vs Llama 3.1 70B
Frequently asked questions
Answers pull from the same numbers you see on this page. The short model note from our index: The Qwen3.5 native vision-language Flash models are built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency. Compared to t...
Also from Alibaba
Other models by Alibaba with live pricing in our catalog.