DeepSeek-R1 vs Qwen3-235B: Detailed Comparison

Choosing between DeepSeek-R1 (DeepSeek) and Qwen3-235B (Alibaba) comes down to three things: per-token pricing, context window, and which capabilities matter most for your workload. DeepSeek-R1 costs $0.55/M input tokens versus $0.50/M for Qwen3-235B, and both models have a 128K-token context window. A detailed breakdown follows.

Side-by-side specs

Spec	DeepSeek-R1	Qwen3-235B
Provider	DeepSeek	Alibaba
Released	2025-01-20	2025-04-29
Input price	$0.55/M	$0.50/M
Output price	$2.19/M	$2.00/M
Cached input	$0.14/M	—
Context window	128K	128K
Max output	32K	8K
Modalities	text	text
Tokenizer	`deepseek`	`qwen`
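
The cached-input rate changes the effective input price once part of a prompt is served from cache. The following is a minimal sketch, assuming cached tokens are billed at the cached rate and all other input tokens at the standard rate; the function names and the 60% hit rate are hypothetical and used only for illustration.

```python
def effective_input_price(base: float, cached: float, hit_rate: float) -> float:
    """Blended input price in USD per million tokens.

    Assumes a fraction `hit_rate` of input tokens is billed at the cached
    rate and the remainder at the base rate (an assumption, not a quoted
    billing rule).
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * cached + (1.0 - hit_rate) * base


def break_even_hit_rate(base: float, cached: float, target: float) -> float:
    """Cache hit rate at which the blended price equals `target`."""
    if base <= cached:
        raise ValueError("base price must exceed cached price")
    return (base - target) / (base - cached)


# DeepSeek-R1 prices from the table above; the 0.6 hit rate is hypothetical.
blended = effective_input_price(base=0.55, cached=0.14, hit_rate=0.6)
print(f"Blended input price: ${blended:.3f}/M")

# Hit rate at which DeepSeek-R1 input matches Qwen3-235B's $0.50/M.
threshold = break_even_hit_rate(base=0.55, cached=0.14, target=0.50)
print(f"Break-even hit rate: {threshold:.1%}")
```

Under these assumptions, DeepSeek-R1's blended input price falls below Qwen3-235B's listed $0.50/M once roughly 12% of input tokens are cache hits. Output pricing is unaffected.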

Capability matrix

Capability	DeepSeek-R1	Qwen3-235B
json mode	Yes	Yes
streaming	Yes	Yes
reasoning	Yes	No
function calling	No	Yes
thinking	No	Yes
tool use	No	Yes

Benchmark comparison

Higher is better for all benchmarks shown. A dash means no score is listed for that model, so no Δ can be computed.

Benchmark	Category	DeepSeek-R1	Qwen3-235B	Δ
GPQA Diamond	reasoning	71.5	—	—
MATH-500	math	97.3	—	—
AIME 2024	math	79.8	—	—

Per-call cost on typical workloads

Workload (in/out tokens)	DeepSeek-R1	Qwen3-235B	Cheaper by
Standard chat (1K / 500)	$0.001645	$0.001500	Qwen3-235B by $0.000145
RAG (4K / 500)	$0.003295	$0.003000	Qwen3-235B by $0.000295
Long doc (20K / 1K)	$0.013190	$0.012000	Qwen3-235B by $0.001190
Very long context (100K / 1.5K)	$0.058285	$0.053000	Qwen3-235B by $0.005285
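
The per-call figures above follow from the listed per-million-token prices. Here is a minimal sketch of that arithmetic, assuming 1K means 1,000 tokens and ignoring cached-input discounts; the `PRICES` dictionary and `call_cost` helper are illustrative names, not part of any provider SDK.

```python
# Prices in USD per million tokens, copied from the specs table.
PRICES = {
    "DeepSeek-R1": {"input": 0.55, "output": 2.19},
    "Qwen3-235B": {"input": 0.50, "output": 2.00},
}


def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one call at list prices (no cache discount)."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Standard chat workload: 1,000 input tokens and 500 output tokens.
for name in PRICES:
    print(f"{name}: ${call_cost(name, 1_000, 500):.6f}")
```

Substituting other token counts gives the remaining rows. Reasoning models can emit many more output tokens than the nominal workload suggests, so real costs may be higher than these list-price estimates.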

When to choose DeepSeek-R1 over Qwen3-235B

Supports reasoning — Qwen3-235B does not.
Max output is 32K tokens, versus 8K for Qwen3-235B.
Lists a cached-input price ($0.14/M); none is listed for Qwen3-235B.

When to choose Qwen3-235B over DeepSeek-R1

Per-token input cost is about 9% lower than DeepSeek-R1's ($0.50/M vs $0.55/M), and output cost is about 9% lower as well ($2.00/M vs $2.19/M).
Supports function calling — DeepSeek-R1 does not.
Supports thinking — DeepSeek-R1 does not.
Supports tool use — DeepSeek-R1 does not.
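
The 9% figure in the list above is the price gap expressed relative to the more expensive model. A short sketch of that calculation, using the list prices from the specs table; `percent_lower` is a hypothetical helper name.

```python
def percent_lower(cheaper: float, pricier: float) -> float:
    """Percentage by which `cheaper` undercuts `pricier`."""
    if pricier <= 0:
        raise ValueError("pricier must be positive")
    return (pricier - cheaper) / pricier * 100


# List prices in USD per million tokens: Qwen3-235B vs DeepSeek-R1.
input_saving = percent_lower(cheaper=0.50, pricier=0.55)
output_saving = percent_lower(cheaper=2.00, pricier=2.19)

print(f"Input:  {input_saving:.1f}% lower")
print(f"Output: {output_saving:.1f}% lower")
```

Both gaps round to about 9%, so the relative saving stays roughly constant regardless of the input-to-output mix.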