Codestral vs DeepSeek-V3: Detailed Comparison

Choosing between Codestral (Mistral AI) and DeepSeek-V3 (DeepSeek) comes down to three things: per-token pricing, context window, and which capabilities matter most for your workload. Codestral costs $0.20/M input tokens vs $0.27/M for DeepSeek-V3, and the context windows are 32K and 128K tokens respectively. The detailed breakdown follows.

Side-by-side specs

Spec	Codestral	DeepSeek-V3
Provider	Mistral AI	DeepSeek
Released	2024-05-29	2024-12-26
Input price	$0.20/M	$0.27/M
Output price	$0.60/M	$1.10/M
Cached input	—	$0.07/M
Context window	32K	128K
Max output	4K	8K
Modalities	text	text
Tokenizer	`mistral`	`deepseek`
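
The cached-input price can change the input-cost comparison for workloads that reuse a long prompt prefix. The sketch below is a rough illustration only: it assumes cached input tokens are billed at the cached rate and all other input tokens at the standard rate, and the helper names `blended_input_price` and `break_even_hit_rate` are hypothetical. Actual cache billing rules should be checked against the provider's documentation.

```python
def blended_input_price(standard: float, cached: float, hit_rate: float) -> float:
    """Effective USD price per million input tokens for a given cache hit rate.

    Assumption: a fraction `hit_rate` of input tokens is billed at the cached
    price and the remainder at the standard price.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * cached + (1.0 - hit_rate) * standard


def break_even_hit_rate(standard: float, cached: float, target: float) -> float:
    """Cache hit rate at which the blended input price equals `target`."""
    return (standard - target) / (standard - cached)


if __name__ == "__main__":
    # Prices from the specs table: DeepSeek-V3 at $0.27/M standard and
    # $0.07/M cached input, Codestral at $0.20/M input.
    rate = break_even_hit_rate(standard=0.27, cached=0.07, target=0.20)
    print(f"Break-even cache hit rate: {rate:.0%}")
    print(f"Blended price at 50% hits: ${blended_input_price(0.27, 0.07, 0.5):.2f}/M")
```

Under that billing assumption, DeepSeek-V3's blended input price matches Codestral's $0.20/M at a cache hit rate of about 35%. Output pricing is unaffected by caching.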

Capability matrix

Capability	Codestral	DeepSeek-V3
function calling	Yes	Yes
json mode	Yes	Yes
streaming	Yes	Yes
code	Yes	No

Benchmark comparison

Higher is better for all benchmarks shown. No Codestral scores are listed for these benchmarks, so the Δ column is empty.

Benchmark	Category	Codestral	DeepSeek-V3	Δ
MMLU-Pro	general	—	75.9	—
HumanEval	coding	—	82.6	—

Per-call cost on typical workloads

Workload (in/out tokens)	Codestral	DeepSeek-V3	Cheaper by
Standard chat (1K / 500)	$0.000500	$0.000820	Codestral by $0.000320
RAG (4K / 500)	$0.001100	$0.001630	Codestral by $0.000530
Long doc (20K / 1K)	$0.004600	$0.006500	Codestral by $0.001900
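Very long context (100K / 1.5K)	$0.020900	$0.028650	Codestral by $0.007750

Note: a 100K-token input exceeds Codestral's 32K context window, so the Codestral figure in the last row is a price extrapolation rather than a call the model can actually serve.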
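
The figures in the table follow from the list prices: cost = input tokens × input price + output tokens × output price, with both prices quoted per million tokens. A minimal Python sketch of that arithmetic is below. The `PRICES` dictionary and the `call_cost` helper are illustrative names, not part of either provider's SDK, and the sketch ignores any caching discount.

```python
# Per-million-token list prices (USD) copied from the specs table.
# PRICES and call_cost are hypothetical names used only for this sketch.
PRICES = {
    "Codestral": {"input": 0.20, "output": 0.60},
    "DeepSeek-V3": {"input": 0.27, "output": 1.10},
}


def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one call at list price (no caching discount)."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # (input tokens, output tokens) for the first three workloads in the table.
    workloads = {
        "Standard chat": (1_000, 500),
        "RAG": (4_000, 500),
        "Long doc": (20_000, 1_000),
    }
    for name, (tokens_in, tokens_out) in workloads.items():
        codestral = call_cost("Codestral", tokens_in, tokens_out)
        deepseek = call_cost("DeepSeek-V3", tokens_in, tokens_out)
        print(f"{name}: Codestral ${codestral:.6f}, DeepSeek-V3 ${deepseek:.6f}")
```

Swapping in your own token counts gives a per-call estimate; multiply by expected call volume to compare monthly spend.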

When to choose Codestral over DeepSeek-V3

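Per-token input cost is 26% lower ($0.20/M vs $0.27/M) and output cost is about 45% lower ($0.60/M vs $1.10/M), which adds up for high-volume workloads.
Listed with the code capability, which DeepSeek-V3 lacks in the matrix above. DeepSeek-V3 still reports 82.6 on HumanEval, so the flag likely marks code specialization rather than an inability to write code.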

When to choose DeepSeek-V3 over Codestral

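Larger context window (128K vs 32K tokens).
Higher max output per call (8K vs 4K tokens).
Cached input pricing ($0.07/M); Codestral lists none.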
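
Whether a request fits a model at all can be checked before comparing prices. The sketch below is one way to do that, under two assumptions that are not stated in the specs table: "K" is read as 1,000 tokens, and input and output tokens share the same context window. The `Limits` class, `LIMITS` table, and `fits` helper are hypothetical names for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    context_window: int  # total tokens per call (assumed to cover input + output)
    max_output: int      # maximum tokens generated per call


# Values from the specs table, reading "K" as 1,000 tokens (an assumption).
LIMITS = {
    "Codestral": Limits(context_window=32_000, max_output=4_000),
    "DeepSeek-V3": Limits(context_window=128_000, max_output=8_000),
}


def fits(model: str, input_tokens: int, output_tokens: int) -> bool:
    """True if the request respects both the output cap and the context window."""
    limits = LIMITS[model]
    return (
        output_tokens <= limits.max_output
        and input_tokens + output_tokens <= limits.context_window
    )


if __name__ == "__main__":
    for model in LIMITS:
        print(model, fits(model, input_tokens=100_000, output_tokens=2_000))
```

With these limits, a 100K-token prompt is rejected for Codestral and accepted for DeepSeek-V3, which is the practical meaning of the context-window difference.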