LLM Context Window Comparison
Compare the context windows of today's major language models and see how much text each one can hold. Enter a token or word count and the table shows which models fit it and how much of their window it fills. Useful for deciding whether a long document, codebase, or chat history will fit before you commit to a model. All calculation runs in your browser.
Conversions assume ~4 characters and ~0.75 words per token (English). Context windows are shared between prompt and output, so leave headroom for the response. Figures reflect each model's standard published window.
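As a rough illustration of the conversion rule above, the arithmetic can be sketched in a few lines of Python. The function name `to_tokens` and the unit labels are hypothetical, not the tool's actual code, and pages are left out because no words-per-page figure is stated here.

```python
import math

# Rough English averages quoted above; real tokenizers will differ.
CHARS_PER_TOKEN = 4.0
WORDS_PER_TOKEN = 0.75


def to_tokens(amount: float, unit: str) -> int:
    """Convert an amount of text in the given unit to an approximate token count."""
    if unit == "tokens":
        tokens = amount
    elif unit == "words":
        tokens = amount / WORDS_PER_TOKEN
    elif unit == "characters":
        tokens = amount / CHARS_PER_TOKEN
    else:
        raise ValueError(f"unsupported unit: {unit}")
    # Round up so the estimate never understates the input size.
    return math.ceil(tokens)


print(to_tokens(3000, "words"))        # 4000
print(to_tokens(10000, "characters"))  # 2500
```

Rounding up is a deliberate choice in this sketch: when checking against a hard limit, overestimating by one token is safer than underestimating.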
How to use the LLM Context Window Comparison
Enter how much text you want to send and choose the unit — tokens, words, characters, or pages. The table converts your input to tokens and shows each model's context window, the percentage your input would fill, and whether it fits with room to spare. Sort by window size or name, and tick "only models that fit" to filter the list down to viable choices.
Remember the window is shared between your prompt and the model's reply. A prompt that fills 95% of the window leaves almost no room for output, so treat anything above roughly 80% as a tight fit rather than a comfortable one.
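The fill percentage and the "tight fit" rule of thumb can be sketched as follows. The model names and window sizes in `WINDOWS` are placeholders for illustration only, and the 0.80 threshold simply encodes the rough guideline above; none of this is taken from the tool's source.

```python
# Hypothetical entries for illustration; check each provider's documentation
# for real model names and window sizes.
WINDOWS = {"small-8k": 8_000, "mid-128k": 128_000, "large-1m": 1_000_000}
TIGHT_THRESHOLD = 0.80  # above this share of the window, treat the fit as tight


def classify(input_tokens: int, window: int) -> tuple[float, str]:
    """Return (percent of window filled, status label) for one model."""
    fill = input_tokens / window
    if fill > 1.0:
        status = "does not fit"
    elif fill > TIGHT_THRESHOLD:
        status = "tight fit"
    else:
        status = "fits"
    return round(fill * 100, 1), status


def compare(input_tokens: int, only_fitting: bool = False) -> list[tuple]:
    """Build table rows sorted by window size, optionally hiding models that overflow."""
    rows = []
    for name, window in sorted(WINDOWS.items(), key=lambda kv: kv[1]):
        pct, status = classify(input_tokens, window)
        if only_fitting and status == "does not fit":
            continue
        rows.append((name, window, pct, status))
    return rows


for row in compare(110_000):
    print(row)
```

With a 110,000-token input, the 128K placeholder comes out at 85.9% and is labelled a tight fit, which matches the guidance to leave room for the reply.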
What a context window is — and why it caps your prompt
A model's context window is the maximum number of tokens it can attend to at once: the hard ceiling on prompt plus generated output combined. If your input plus the expected answer exceeds the window, the request is either rejected outright or truncated, with the oldest tokens silently dropped, so knowing the limit ahead of time saves a wasted call.
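A pre-flight check along these lines can catch an oversized request before it is sent. This is a minimal sketch with hypothetical function names, assuming you already have a token count for the prompt and an upper bound on the reply length.

```python
def output_headroom(window: int, prompt_tokens: int) -> int:
    """Tokens left for the reply once the prompt is in the window (never negative)."""
    return max(window - prompt_tokens, 0)


def request_fits(window: int, prompt_tokens: int, expected_output_tokens: int) -> bool:
    """True when prompt plus expected output stays within the shared window."""
    return prompt_tokens + expected_output_tokens <= window


window = 128_000
prompt = 125_000

print(output_headroom(window, prompt))      # 3000
print(request_fits(window, prompt, 4_000))  # False
print(request_fits(window, prompt, 2_000))  # True
```

The second call fails even though the prompt alone fits, which is the case the paragraph above warns about.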
Windows vary enormously across the current generation. Gemini's models lead with one-to-two-million-token windows; most frontier chat models (GPT-4o, Claude, Llama 3.1, Mistral, Qwen, DeepSeek, Grok) cluster around 128K–200K; and some efficient open models like Gemma 2 stay as low as 8K. A million-token window sounds limitless, but a single large codebase or a long PDF can still consume hundreds of thousands of tokens, and cost and latency both climb with the amount of context you actually use.
Because tokenizers differ, the same text produces a slightly different token count on each model. This tool uses the common English approximation of about four characters per token, so treat its numbers as estimates. For an exact count on a specific model, use that model's dedicated token counter or the token count its API returns.
Common use cases
- Picking a model for a long document. See at a glance which models can hold your full input.
- RAG sizing. Check whether your retrieved chunks plus the question fit a target model's window.
- Codebase prompts. Estimate whether a large source tree fits before paying for a long call.
- Comparing upgrades. Weigh a jump from a 128K to a 1M window against the higher cost it brings.
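For the RAG sizing case, one possible approach is to reserve room for the answer and the question, then keep adding retrieved chunks until the remaining budget runs out. The sketch below uses the four-characters-per-token approximation; `pack_chunks`, its parameters, and the assumption that chunks arrive sorted by relevance are all illustrative choices rather than a prescribed method.

```python
import math


def estimate_tokens(text: str) -> int:
    """Approximate token count using ~4 characters per token (English)."""
    return math.ceil(len(text) / 4)


def pack_chunks(question: str, chunks: list[str], window: int,
                reserve_for_output: int) -> tuple[list[str], int]:
    """Keep chunks in order until the prompt budget is exhausted.

    Assumes `chunks` is already sorted from most to least relevant.
    Returns the kept chunks and the estimated tokens they use.
    """
    budget = window - reserve_for_output - estimate_tokens(question)
    kept: list[str] = []
    used = 0
    for chunk in chunks:
        cost = estimate_tokens(chunk)
        if used + cost > budget:
            break  # stop at the first chunk that would overflow the budget
        kept.append(chunk)
        used += cost
    return kept, used


# Three synthetic chunks of about 1,000 tokens each and a 10-token question.
chunks = ["a" * 4000, "b" * 4000, "c" * 4000]
kept, used = pack_chunks("q" * 40, chunks, window=3_000, reserve_for_output=500)
print(len(kept), used)  # 2 2000
```

Because the character-based estimate is approximate, a real pipeline would likely replace `estimate_tokens` with the target model's own token counter before relying on the result.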