Fine-tuning JSONL Validator

Catch the errors that make a fine-tuning upload fail before you spend a job on it. Paste your JSONL dataset (one example per line, in either the chat messages format or the legacy prompt/completion format) and every line is checked for valid JSON, a recognised structure, correct roles, and non-empty content. It flags examples with no assistant turn to train on, conversations that open on an assistant turn, exact-duplicate lines, and message weights other than 0 or 1, then summarises the dataset with an example count, the detected format, a rough token estimate and the role distribution. Everything runs in your browser, so your training data never leaves your machine.


How to use the Fine-tuning JSONL Validator

Paste your dataset with one JSON object per line — the JSONL format every major fine-tuning API expects. The validator auto-detects whether you're using the modern chat format (each line an object with a messages array of role/content turns) or the legacy prompt/completion format, and checks each line against the right rules. For chat examples it confirms messages is a non-empty array, every turn has a known role (system, developer, user, assistant, tool or function) and some content or tool calls, that there is at least one assistant message to learn from, and that the conversation doesn't open on an assistant turn. For prompt/completion it checks both fields are strings and warns when a completion lacks the conventional leading space.
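As a rough illustration of the chat-format rules described above, here is a minimal Python sketch. It is not the validator's actual code. The function name `check_chat_example`, the message wording, and the `tool_calls` key are assumptions made for the example, and the split between errors and warnings follows the FAQ further down this page.

```python
KNOWN_ROLES = {"system", "developer", "user", "assistant", "tool", "function"}


def check_chat_example(example):
    """Return (errors, warnings) for one already-parsed chat-format example."""
    errors, warnings = [], []
    messages = example.get("messages") if isinstance(example, dict) else None
    if not isinstance(messages, list) or not messages:
        return ["messages must be a non-empty array"], warnings

    for i, msg in enumerate(messages):
        if not isinstance(msg, dict):
            errors.append(f"message {i} is not an object")
            continue
        role = msg.get("role")
        if not isinstance(role, str) or role not in KNOWN_ROLES:
            errors.append(f"message {i} has unknown role {role!r}")
        # Assumption: tool calls live under a "tool_calls" key.
        if not msg.get("content") and not msg.get("tool_calls"):
            warnings.append(f"message {i} has no content or tool calls")

    if not any(isinstance(m, dict) and m.get("role") == "assistant" for m in messages):
        errors.append("no assistant message to train on")

    first = messages[0]
    if isinstance(first, dict) and first.get("role") == "assistant":
        warnings.append("conversation opens on an assistant turn")
    return errors, warnings
```

A well-formed user/assistant pair returns two empty lists, while a mistyped role such as "asistant" produces both an unknown-role error and a missing-assistant error.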

Each problem is reported against its line number, with errors (which would break the upload) separated from warnings (things worth checking but not fatal). The summary cards give you the totals at a glance: how many examples are valid versus broken, the detected format, a rough token estimate (characters ÷ 4) for budgeting the training cost, the number of exact-duplicate lines, and how the roles are distributed across the set. Because everything is parsed and validated locally, you can safely run a real, private dataset through it — nothing is uploaded — and fix the flagged lines before sending the file to OpenAI, Together, Fireworks or whichever platform you're training on.

Why fine-tuning datasets fail validation

Supervised fine-tuning expects training data as JSONL: a plain-text file with one self-contained JSON example per line. The format is deliberately simple so it streams cheaply, but that simplicity means a single malformed line (a trailing comma, an unescaped quote, a stray blank line) can fail the whole upload, and the error a platform returns is often just a line number with little explanation. Validating locally first turns a slow remote round-trip into an instant check, and surfaces the structural problems that a bare JSON parser won't notice.
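To make the line-by-line failure mode concrete, the sketch below parses a JSONL string one line at a time and records which line numbers fail. It is an illustrative sketch only: `parse_jsonl` is a hypothetical helper, and treating a blank line as a reportable problem is an assumption based on the description above.

```python
import json


def parse_jsonl(text):
    """Return a list of (line_number, parsed_object, problem) tuples."""
    lines = text.split("\n")
    if lines and lines[-1] == "":
        lines.pop()  # a single trailing newline is not a blank line

    results = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            results.append((number, None, "blank line"))
            continue
        try:
            results.append((number, json.loads(line), None))
        except json.JSONDecodeError as exc:
            results.append((number, None, f"invalid JSON: {exc.msg}"))
    return results


sample = '{"prompt": "a", "completion": " b"}\n\n{"prompt": "a",}\n'
for number, _, problem in parse_jsonl(sample):
    if problem:
        print(f"line {number}: {problem}")
```

Splitting on "\n" rather than calling `str.splitlines()` is deliberate: `splitlines()` also breaks on characters such as U+2028, which can legally appear unescaped inside a JSON string.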

The dominant format today is the chat schema: each line is an object with a messages array, and each message has a role and content, mirroring the conversation format used at inference. The most common mistakes are semantic rather than syntactic. An example with no assistant message gives the model nothing to learn to produce: it is valid JSON but useless for training. A conversation that begins with an assistant turn, a mistyped role like "asistant", an empty content string, or a stray weight value other than 0 or 1 will either error out or quietly degrade the run. The older prompt/completion format is simpler, with two string fields per line, but it has conventions of its own that are easy to forget, such as the leading space that often belongs at the start of a completion.
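Two of the checks mentioned above, the prompt/completion rules and the weight rule, might look like the following sketch. The function names are hypothetical, and accepting only the integers 0 and 1 as weights (rather than, say, 1.0) is an assumption; check your platform's documentation for the exact rule.

```python
def check_prompt_completion(example):
    """Return (errors, warnings) for one parsed prompt/completion example."""
    errors, warnings = [], []
    prompt = example.get("prompt")
    completion = example.get("completion")
    if not isinstance(prompt, str) or not isinstance(completion, str):
        errors.append("prompt and completion must both be strings")
    elif not completion.startswith(" "):
        warnings.append("completion lacks the conventional leading space")
    return errors, warnings


def weight_is_valid(message):
    """A message weight, when present, is assumed to be the integer 0 or 1."""
    if "weight" not in message:
        return True
    weight = message["weight"]
    # type(...) is int rejects booleans, which Python treats as ints.
    return type(weight) is int and weight in (0, 1)
```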

Beyond per-line correctness, a few dataset-level properties are worth knowing before you train. Duplicates waste compute and can skew the model toward repeated examples; spotting exact-match lines is a cheap way to catch a botched export. A token estimate, even the rough characters ÷ 4 approximation, lets you sanity-check the size of the job and its likely cost before committing. And the role distribution is a quick signal that the data is shaped the way you intended: that there are roughly as many assistant turns as you expect, and that system prompts appear where they should. None of this replaces the platform's own validation, which applies the exact tokenizer and limits, but catching the obvious failures locally means the version you upload is far more likely to train on the first try.
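The dataset-level figures described above can be approximated in a few lines. This is a sketch under stated assumptions rather than the tool's implementation: `summarise` is a hypothetical name, the input is assumed to be lines that already parse as JSON objects, and the token estimate counts the raw characters of each line, which may differ from what the tool counts.

```python
import json
from collections import Counter


def summarise(lines):
    """Summarise JSONL lines that are already known to parse as objects."""
    # Exact duplicates: every repeat beyond the first copy of a line.
    seen = Counter(line.strip() for line in lines)
    duplicates = sum(count - 1 for count in seen.values())

    # Assumption: the estimate uses raw line length, divided by four.
    characters = sum(len(line) for line in lines)

    roles = Counter()
    for line in lines:
        for message in json.loads(line).get("messages", []):
            roles[message.get("role")] += 1

    return {
        "examples": len(lines),
        "duplicate_lines": duplicates,
        "estimated_tokens": characters // 4,
        "roles": dict(roles),
    }
```

For prompt/completion lines the `roles` entry simply comes back empty, since those examples have no messages array.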

Common use cases

Pre-upload checks. Catch malformed JSON, bad roles and missing assistant turns before a fine-tuning job fails.
Export debugging. Spot duplicate or empty lines from a botched dataset export.
Cost estimation. Get a rough token count to budget a training run before submitting it.
Format conversion. Confirm a dataset is consistently chat or prompt/completion, not a mix.

Frequently asked questions

Which formats does it validate?

Both common fine-tuning formats: the chat format (each line an object with a "messages" array of role/content turns) and the legacy prompt/completion format (each line with "prompt" and "completion" strings). It auto-detects which you are using from the first valid line and checks the rest against it.
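One way such auto-detection could work is sketched below. `detect_format` and its return values are hypothetical, and "first valid line" is interpreted here as the first line that parses as a JSON object and has one of the two recognised shapes; the tool itself may define it differently.

```python
import json


def detect_format(text):
    """Return "chat", "prompt_completion", or None if no line is recognisable."""
    for line in text.split("\n"):
        try:
            example = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip blank or malformed lines
        if not isinstance(example, dict):
            continue
        if "messages" in example:
            return "chat"
        if "prompt" in example and "completion" in example:
            return "prompt_completion"
    return None
```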

What counts as an error versus a warning?

Errors would break the upload or training — invalid JSON, an unrecognised structure, a missing or unknown role, no assistant message to learn from, or non-string prompt/completion fields. Warnings are things worth reviewing but not fatal: empty content, a conversation starting on an assistant turn, a duplicate line, or a missing leading space in a completion.

How accurate is the token estimate?

It assumes about four characters per token, which is typical for English text. That is good enough for ballpark cost and size planning, but the real count depends on the specific tokenizer and language, so treat the platform's own figure as authoritative for billing.

Why does it flag examples with no assistant message?

In chat fine-tuning the model learns to produce the assistant turns. An example with only system and user messages gives it no target to train on, so while the JSON may be valid, the example contributes nothing and most platforms will reject or ignore it.

Is my training data uploaded anywhere?

No. Every line is parsed and validated entirely in your browser. Nothing you paste is sent to a server or stored, so it is safe to validate private or proprietary datasets here.

Embed this tool on your site

Free to embed, no attribution required (but appreciated). Paste this where you want the tool to appear:

<iframe src="https://codeswap.net/llm/finetune-jsonl-validator/?embed=1" width="100%" height="520" loading="lazy" style="border:1px solid #e5e7eb;border-radius:8px" title="Fine-tuning JSONL Validator"></iframe>
<p style="font-size:13px">Tool by <a href="https://codeswap.net/llm/finetune-jsonl-validator/">Fine-tuning JSONL Validator — Codeswap</a></p>