Chat Template Formatter

Wrap a conversation into the exact prompt string a local model expects. Pick a template family (ChatML, Llama 3, Mistral, Gemma, Phi-3, Alpaca, Vicuna) and see every special token in place, ending with the assistant generation prefix. This is useful when a raw /completion call returns garbage because the prompt was not formatted the way the model was trained.

How to use the Chat Template Formatter

Type an optional system message, then the conversation with one turn per line (each line begins user: or assistant:). Pick the template family your model was trained with and press Format prompt. The output is the literal string — special tokens and all — that you would send to a raw text-completion endpoint, ending with the assistant's generation prefix so the model continues as the assistant.
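The input format described above (one turn per line, each starting with user: or assistant:) can be turned into a list of messages with a few lines of Python. This is only a sketch of one way to do it; the function name parse_turns and the error handling are assumptions for illustration, not the tool's actual code.

```python
def parse_turns(text, system=""):
    """Parse 'user: ...' / 'assistant: ...' lines into a list of message dicts.

    Hypothetical helper: the real tool may parse its input differently.
    """
    messages = []
    if system.strip():
        messages.append({"role": "system", "content": system.strip()})
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines between turns
        # Split on the first colon only, so colons inside the message survive.
        role, sep, content = line.partition(":")
        role = role.strip().lower()
        if sep != ":" or role not in ("user", "assistant"):
            raise ValueError(f"line must start with 'user:' or 'assistant:': {line!r}")
        messages.append({"role": role, "content": content.strip()})
    return messages


conversation = "user: Hi\nassistant: Hello!\nuser: Meet at time: 3pm?"
messages = parse_turns(conversation, system="Be brief.")
```

The resulting list has the same shape as the messages array used by chat endpoints, which makes it a convenient input for any per-family formatter.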

If you call a chat endpoint (/chat/completions, Ollama's chat API) the server applies the template for you, so you do not need this. You need it when you call a raw /completion endpoint, build prompts for fine-tuning, or debug why a base model ignores your instructions.

Why prompt templates matter

Instruction-tuned models are trained on text wrapped in a specific format with special tokens that mark role boundaries — for example <|im_start|> in ChatML or <|start_header_id|> in Llama 3. The model learns to respond only after the assistant marker. Send it a conversation in the wrong format and quality collapses: it may continue your user turn, ignore the system message, or never stop.
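To make the role markers concrete, here is a minimal sketch of ChatML formatting. The function name format_chatml is hypothetical, and individual models may differ in small details (for example, a default system message), so treat the model card as the authority.

```python
def format_chatml(messages):
    """Render messages as a ChatML prompt ending with the assistant prefix.

    Sketch only: each turn is <|im_start|>ROLE\\nCONTENT<|im_end|>\\n.
    """
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    # Generation prefix: the model continues from here as the assistant.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


chatml_prompt = format_chatml([
    {"role": "system", "content": "Be brief."},
    {"role": "user", "content": "Hi"},
])
print(chatml_prompt)
```

The trailing <|im_start|>assistant line is what tells the model whose turn it is. Leaving it off is one way to end up with the model continuing the user turn instead of answering it.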

The families differ in real ways. Gemma has no system role at all, so a system prompt is folded into the first user turn. Mistral wraps user turns in [INST] ... [/INST] and has no explicit assistant token; the reply simply follows [/INST]. Alpaca and Vicuna are plain-text templates with no special tokens, common on older finetunes. Matching the template the model was actually trained on is one of the most common fixes for 'my local model gives bad answers.'
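As an illustration of a template with no assistant token, the following sketch renders a conversation in the original Mistral instruct layout. The name format_mistral is hypothetical. Exact whitespace varies between Mistral tokenizer versions, and many tokenizers add the <s> beginning-of-sequence token themselves, so both details are assumptions to check against the model card.

```python
def format_mistral(messages):
    """Render user/assistant turns in a Mistral-style [INST] layout.

    Sketch only: system messages are not handled here, and spacing
    differs between Mistral tokenizer versions.
    """
    out = "<s>"  # assumption: omit this if your tokenizer adds BOS itself
    for m in messages:
        if m["role"] == "user":
            out += f"[INST] {m['content']} [/INST]"
        elif m["role"] == "assistant":
            # No assistant marker: the reply follows [/INST] and ends with </s>.
            out += f" {m['content']}</s>"
        else:
            raise ValueError(f"unsupported role in this sketch: {m['role']!r}")
    return out


mistral_prompt = format_mistral([
    {"role": "user", "content": "Hi"},
    {"role": "assistant", "content": "Hello!"},
    {"role": "user", "content": "Bye"},
])
print(mistral_prompt)
```

Because the prompt ends right after [/INST], there is no separate generation prefix to append; the closing tag itself signals that the assistant speaks next.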

Common use cases

  • Debugging raw completions — when a /completion call returns nonsense, format the prompt here and compare with what you sent.
  • Fine-tuning data prep — produce training strings in the exact template the base model uses.
  • Learning a new model — see at a glance which special tokens and roles a family expects.
  • Porting prompts — move a conversation from one model family to another and see what changes.
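For the debugging case, comparing the prompt you sent with the correctly formatted one is easier with a helper that reports the first mismatch. The sketch below is one way to do that outside the tool; first_difference is a hypothetical name, and the 20-character snippet length is arbitrary.

```python
def first_difference(sent, expected, width=20):
    """Return (index, sent_snippet, expected_snippet) at the first mismatch.

    Returns None when the two strings are identical.
    """
    for i, (a, b) in enumerate(zip(sent, expected)):
        if a != b:
            return i, sent[i:i + width], expected[i:i + width]
    if len(sent) != len(expected):
        # One string is a prefix of the other; the difference starts at its end.
        i = min(len(sent), len(expected))
        return i, sent[i:i + width], expected[i:i + width]
    return None


sent = "<|im_start|>user\nHi<|im_end|>\n<|im_start|>assistant"
expected = "<|im_start|>user\nHi<|im_end|>\n<|im_start|>assistant\n"
diff = first_difference(sent, expected)
# repr() makes invisible characters such as newlines visible.
print(repr(diff))
```

Printing with repr() matters here, because missing newlines and stray spaces are hard to spot by eye.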

Frequently asked questions

Which template should I pick?

Use the one the model was trained with — check the model card. Qwen and many finetunes use ChatML; Llama 3.x uses its own header format; Mistral uses [INST]; Gemma uses its turn format with no system role.
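For comparison with ChatML, here is a minimal sketch of the Llama 3 header format. The function name format_llama3 is hypothetical. Some runtimes add <|begin_of_text|> during tokenization, so sending it in the text as well may duplicate it; check how your server handles this.

```python
def format_llama3(messages):
    """Render messages in the Llama 3 header layout, ending with the assistant header.

    Sketch only: each turn is a role header, a blank line, the content, then <|eot_id|>.
    """
    out = "<|begin_of_text|>"  # assumption: drop this if your runtime adds it
    for m in messages:
        out += (
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n"
            f"{m['content']}<|eot_id|>"
        )
    # Generation prefix: an open assistant header for the model to continue.
    out += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return out


llama3_prompt = format_llama3([
    {"role": "system", "content": "Be brief."},
    {"role": "user", "content": "Hi"},
])
print(llama3_prompt)
```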

Why does Gemma ignore my system message?

Gemma has no system role. Its template folds any system instruction into the first user turn, which this tool does automatically when you pick Gemma.
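The folding step can be sketched as follows. The name format_gemma is hypothetical, and joining the system text to the first user turn with a blank line is an assumption; the tool or a given Gemma release may use a different separator. The <bos> token is left out on the assumption that the tokenizer adds it.

```python
def format_gemma(messages):
    """Render messages in Gemma's turn layout, folding any system message
    into the first user turn. Sketch only.
    """
    system = ""
    turns = []
    for m in messages:
        if m["role"] == "system":
            system = m["content"]
        else:
            turns.append(dict(m))  # copy so the caller's list is not mutated
    if system and turns and turns[0]["role"] == "user":
        # Assumed separator: a blank line between system text and user text.
        turns[0]["content"] = f"{system}\n\n{turns[0]['content']}"
    role_names = {"user": "user", "assistant": "model"}  # Gemma calls the assistant "model"
    out = ""
    for m in turns:
        out += f"<start_of_turn>{role_names[m['role']]}\n{m['content']}<end_of_turn>\n"
    out += "<start_of_turn>model\n"  # generation prefix
    return out


gemma_prompt = format_gemma([
    {"role": "system", "content": "Be brief."},
    {"role": "user", "content": "Hi"},
])
print(gemma_prompt)
```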

Do I need this for the OpenAI or Ollama chat API?

No. Chat endpoints apply the template server-side from your messages array. You need manual formatting only for raw completion endpoints, fine-tuning data, or debugging.

Is anything uploaded?

No. Formatting runs entirely in your browser; your conversation never leaves your machine.