Multi-Model Token Comparison

Same text, every tokenizer. See how GPT-5 / GPT-4o (o200k_base), GPT-4 / GPT-3.5 (cl100k_base), Claude, and Gemini count the tokens in your input, along with the per-call cost at each model's current list price. Useful when picking a model for multilingual or code-heavy workloads, where tokenization differences can outweigh the difference in per-token price.


Why tokenizer differences matter

The published per-token price isn't the whole cost story, because the same text tokenizes differently in different models. A tokenizer that produces 20% fewer tokens on your particular content type (code, non-English text, structured data) effectively cuts your cost by 20%, on top of whatever difference the published list rates show.
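The arithmetic behind that claim can be sketched in a few lines of Python. The token counts and the price below are made-up illustration values, not measurements from any real tokenizer or price list, and `call_cost` is a hypothetical helper name.

```python
def call_cost(tokens: int, usd_per_million_tokens: float) -> float:
    """Cost of one call: token count times the per-token list price."""
    return tokens * usd_per_million_tokens / 1_000_000


# Hypothetical numbers: the same input text under two tokenizers,
# billed at the same (assumed) list price.
LIST_PRICE = 2.50        # USD per million input tokens (assumption)
baseline_tokens = 1000   # tokenizer A
efficient_tokens = 800   # tokenizer B produces 20% fewer tokens

baseline_cost = call_cost(baseline_tokens, LIST_PRICE)
efficient_cost = call_cost(efficient_tokens, LIST_PRICE)

# Fractional saving that comes purely from tokenization.
saving = 1 - efficient_cost / baseline_cost

print(f"baseline:  ${baseline_cost:.6f}")
print(f"efficient: ${efficient_cost:.6f}")
print(f"saving:    {saving:.0%}")
```

Because cost is linear in token count, any percentage reduction in tokens carries straight through to the bill at a fixed list price.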

For English prose the spread is small (5-15% between extremes). For Mandarin, Japanese, Korean, Arabic, and Hindi, the spread can be 2-3x, so the model with the highest per-token price isn't always the most expensive in practice once tokenization is accounted for. This tool surfaces that with a side-by-side count.
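As a rough illustration of how a higher list price can still come out cheaper, the sketch below ranks two hypothetical models by actual per-call cost. The model names, token counts, and prices are invented for the example (a 2.5x token spread, in line with the non-English range above); real figures would come from each provider's tokenizer and price list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelQuote:
    """One model's token count for a given text plus its list price."""
    name: str                 # hypothetical model label
    tokens: int               # tokens this model's tokenizer produces
    usd_per_million: float    # assumed list price per million tokens

    @property
    def cost(self) -> float:
        return self.tokens * self.usd_per_million / 1_000_000


def rank_by_cost(quotes: list[ModelQuote]) -> list[ModelQuote]:
    """Cheapest actual per-call cost first."""
    return sorted(quotes, key=lambda q: q.cost)


# Same non-English text, two made-up models.
quotes = [
    ModelQuote("model-a", tokens=1200, usd_per_million=3.00),
    ModelQuote("model-b", tokens=3000, usd_per_million=2.00),
]

priciest_per_token = max(quotes, key=lambda q: q.usd_per_million)
cheapest_in_practice = rank_by_cost(quotes)[0]

for q in rank_by_cost(quotes):
    print(f"{q.name}: {q.tokens} tokens, ${q.cost:.4f} per call")
```

In this made-up case `model-a` has the higher per-token price but the lower per-call cost, because its tokenizer needs far fewer tokens for the same text.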

