OpenAI Batch API JSONL Builder
Build JSONL input files for the OpenAI Batch API from a list of prompts, or validate existing JSONL for spec compliance. The Batch API processes up to 50,000 requests at 50% cost with a 24-hour window — this tool ensures your .jsonl file is correctly formatted before upload.
How to use the OpenAI Batch API JSONL Builder
In Build mode: enter one prompt per line (each line becomes a separate batch request). Select the model, endpoint, and an optional system prompt. Click Generate to produce JSONL — one JSON object per line — ready to upload to the OpenAI Batch API. The custom_id is auto-assigned as request-1, request-2, etc.
In Validate mode: paste an existing JSONL file and the tool checks every line for: valid JSON, presence of custom_id, uniqueness of custom_id, method is POST, URL is present, and body is present. Results are shown per-line in the output area.
The generated JSONL can be directly uploaded via the OpenAI files endpoint (POST /v1/files with purpose=batch), then referenced in a POST /v1/batches call. Download the file and upload it with your preferred HTTP client or the OpenAI SDK.
About the OpenAI Batch API
The OpenAI Batch API accepts a JSONL file containing up to 50,000 individual API requests, processes them asynchronously within 24 hours, and returns results at 50% of the standard API price. It is designed for large-scale inference workloads where latency is not critical: classification of datasets, bulk embeddings, offline summarization, and evaluation runs.
Each line of the input JSONL must be a valid JSON object with four required fields: custom_id (a string you choose, unique within the file), method (must be POST), url (the API endpoint, e.g., /v1/chat/completions), and body (the same object you would send to that endpoint in a normal synchronous call). The custom_id is how you correlate results back to inputs — the output JSONL uses the same field.
Common mistakes that cause batch jobs to fail: duplicate custom_id values, missing required body fields (e.g., model or messages), embedding endpoints used with a chat body, or the JSONL file exceeding 100 MB. This tool validates all of these before you upload.
Common use cases
- Bulk classification — classify thousands of support tickets, product reviews, or documents at half the API cost.
- Dataset embeddings — generate embeddings for entire corpora in a single batch job rather than looping with rate-limit handling.
- Offline evaluation — run LLM-as-judge evaluations over a benchmark dataset without occupying your synchronous API quota.
- Pre-validating uploads — catch JSONL formatting errors locally before wasting an upload slot on a malformed file.
- Prompt format conversion — convert a list of prompts from another format into Batch API JSONL as part of a data pipeline.
Frequently asked questions
What is the file size limit for Batch API uploads?
How do I match output lines back to my input prompts?
custom_id you set in the input. Sort or join on that field to correlate results. This is why unique custom_ids are required.Can I mix chat and embedding requests in one batch?
What happens if some requests fail in a batch?
How long does a batch job take?
GET /v1/batches/{id} for status.