Grok Token Counter

Estimate tokens and cost for xAI's Grok models. Grok uses a byte-level BPE tokenizer with a vocabulary of about 131,072 tokens, and its models serve a 131,072-token context window. Paste text and choose a model; the token count, context-window usage, and per-call cost on the xAI API update live. Everything runs in your browser; nothing is uploaded.

Default prices are representative xAI API list rates; confirm current pricing in the xAI console and use the API's usage field for exact billing.

How to use the Grok Token Counter

Paste your text and pick a Grok model. The token estimate, characters-per-token ratio, and context-window usage update as you type. Set your expected output length and monthly call volume to see the per-call and per-month cost, and override the price fields to match your current xAI rate.
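The cost arithmetic described above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's own code: the function names are hypothetical, and the prices are placeholders rather than actual xAI rates.

```python
def per_call_cost(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Cost in USD of one call, given prices in USD per million tokens."""
    input_cost = input_tokens * input_price_per_m / 1_000_000
    output_cost = output_tokens * output_price_per_m / 1_000_000
    return input_cost + output_cost


def monthly_cost(cost_per_call, calls_per_month):
    """Projected monthly spend for a fixed per-call cost."""
    return cost_per_call * calls_per_month


# Placeholder prices (assumption): 2.00 USD/M input, 10.00 USD/M output.
call = per_call_cost(1_200, 400, input_price_per_m=2.00, output_price_per_m=10.00)
month = monthly_cost(call, calls_per_month=50_000)
print(f"per call: ${call:.4f}, per month: ${month:.2f}")
```

Input and output tokens are priced separately because API list rates usually differ between the two, which is why the expected output length matters for the estimate.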

Grok 2 is the flagship general model, Grok 2 mini is the cheaper high-volume option, and Grok Beta was the earlier preview endpoint. All three share the same tokenizer and 131K context window, so the token count for a given text is identical across them; only the price differs.
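Because the token count is the same on every tier, comparing tiers reduces to applying different prices to one pair of token counts. A rough sketch, assuming a hypothetical price table (the figures below are placeholders, not xAI list rates):

```python
# Hypothetical prices in USD per million tokens; replace with current xAI rates.
PLACEHOLDER_PRICES = {
    "Grok 2": {"input": 2.00, "output": 10.00},
    "Grok 2 mini": {"input": 0.20, "output": 1.00},
}


def compare_tiers(input_tokens, output_tokens, prices=PLACEHOLDER_PRICES):
    """Return per-call cost for each tier, cheapest first."""
    costs = {}
    for model, price in prices.items():
        costs[model] = (
            input_tokens * price["input"] + output_tokens * price["output"]
        ) / 1_000_000
    return dict(sorted(costs.items(), key=lambda item: item[1]))


for model, cost in compare_tiers(1_200, 400).items():
    print(f"{model}: ${cost:.5f} per call")
```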

The Grok tokenizer and context window

Grok uses a byte-level byte-pair-encoding tokenizer with a vocabulary of around 131,072 tokens. That is roughly the same size as Llama 3's 128K vocabulary and far larger than the 32K vocabularies of older open models. A vocabulary that size maps many whole words and common word-pieces to single tokens, so for English the practical ratio lands near 3.8–4.0 characters per token; this estimator uses about 3.9. As with any BPE tokenizer, the exact count depends on content: code, JSON, and non-Latin scripts tokenize at different rates than plain prose. Treat the figure as a close estimate rather than an exact bill.
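A characters-per-token heuristic of this kind can be sketched as below. This is an illustration of the general approach, not the tool's actual implementation; the function name is hypothetical, and the 3.9 default applies to English prose only.

```python
import math

# Assumed average for English prose; code, JSON, and non-Latin text will differ.
CHARS_PER_TOKEN = 3.9


def estimate_tokens(text, chars_per_token=CHARS_PER_TOKEN):
    """Estimate the token count of text from its character length."""
    if not text:
        return 0
    # Round up so a short non-empty string never counts as zero tokens.
    return math.ceil(len(text) / chars_per_token)


prompt = "Summarize the attached quarterly report in three bullet points."
print(len(prompt), "characters ->", estimate_tokens(prompt), "estimated tokens")
```

Passing a different `chars_per_token` value is one way to adapt the estimate to content that tokenizes less efficiently than English prose.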

Grok models serve a 131,072-token context window, shared between the prompt you send and the output the model generates. That is comfortably large for long documents and chat histories, but it is still finite, so this tool reports what fraction of the window your input fills. That figure is the quickest way to tell whether a long prompt plus its expected answer will fit before you send the request.
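The fit check amounts to comparing input plus expected output against the window size. A minimal sketch, with hypothetical function and field names:

```python
CONTEXT_WINDOW = 131_072  # tokens, shared by prompt and output


def window_usage(input_tokens, expected_output_tokens=0, window=CONTEXT_WINDOW):
    """Report how much of the context window a request would consume."""
    total = input_tokens + expected_output_tokens
    return {
        "input_fraction": input_tokens / window,
        "fits": total <= window,
        "remaining": window - total,  # negative when the request overflows
    }


usage = window_usage(input_tokens=120_000, expected_output_tokens=8_000)
print(f"input fills {usage['input_fraction']:.1%} of the window; fits: {usage['fits']}")
```

Since the input count is itself an estimate, leaving some headroom rather than filling the window exactly is the safer choice.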

Because xAI does not publish a standalone tokenizer for offline use, an exact count requires calling the API and reading the usage field it returns. The heuristic used by this tool is calibrated to give a planning number that is within a few percent for typical English text.
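Reading the usage field might look like the sketch below. It assumes the response follows the OpenAI-compatible chat-completions shape, with `prompt_tokens`, `completion_tokens`, and `total_tokens` under `usage`; verify the field names against the current xAI API reference. The sample response is made-up illustrative data, not real API output, and no network call is made.

```python
def read_usage(response):
    """Extract token counts from a parsed API response (assumed shape)."""
    usage = response.get("usage") or {}
    return {
        "prompt_tokens": usage.get("prompt_tokens", 0),
        "completion_tokens": usage.get("completion_tokens", 0),
        "total_tokens": usage.get("total_tokens", 0),
    }


def relative_error(estimated, actual):
    """Signed error of an estimate relative to the billed count."""
    return (estimated - actual) / actual


# Illustrative response body only; the numbers are invented for this example.
sample_response = {
    "usage": {"prompt_tokens": 1043, "completion_tokens": 212, "total_tokens": 1255},
}

billed = read_usage(sample_response)
error = relative_error(estimated=1020, actual=billed["prompt_tokens"])
print(f"billed prompt tokens: {billed['prompt_tokens']}, estimate off by {error:+.1%}")
```

Comparing the estimate with the billed count on a few representative prompts is a simple way to check how well the heuristic holds for your own content.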

Common use cases

  • Costing xAI API calls. See the per-call and monthly cost of a prompt on each Grok tier.
  • Choosing a tier. Compare how the same prompt's cost changes between Grok 2 and Grok 2 mini.
  • Fitting the context window. Confirm a long input fits the 131K window before sending it.
  • Budget planning. Project monthly spend from an expected call volume and output length.

Frequently asked questions

How big is the Grok vocabulary?

Grok's BPE tokenizer uses a vocabulary of about 131,072 tokens, similar in size to Llama 3's. A larger vocabulary maps more words and word-pieces to single tokens, keeping the English ratio near 3.9 characters per token.

Is the token count exact?

No. It is a heuristic estimate calibrated to roughly 3.9 characters per token for English, accurate to within a few percent for typical prose. xAI does not publish an offline tokenizer, so for exact counts and billing you should read the usage field the API returns with each response.

What context window does Grok support?

Grok models serve a 131,072-token context window, shared between your input and the model's output. This tool shows what fraction of that window your input occupies so you can judge whether a long prompt and its answer will fit.

Are the prices accurate?

The defaults are representative xAI API list rates and are fully editable. Pricing changes over time and by model, so confirm the current rate in the xAI console and edit the fields to match before relying on the cost estimate.

Why is the count the same across Grok models?

Because Grok 2, Grok 2 mini, and Grok Beta all share the same tokenizer. The token count for a given text is therefore identical; only the price per million tokens and the model's quality differ.