WebVTT Viewer

Paste a .vtt WebVTT caption file and read it as a clean table instead of scrolling raw cue blocks. The viewer checks for the required WEBVTT header, parses each cue into id, start, end, duration, settings and text, surfaces NOTE comments and STYLE blocks, totals the runtime, and warns about overlapping cues. Cue settings like line:, position: and align: are shown per cue. Nothing is uploaded — the file is parsed in your browser, so private or unpublished captions never leave your device.

How to use the WebVTT Viewer

Paste WebVTT text into the box or load a .vtt file. The viewer first confirms that the file opens with the mandatory WEBVTT signature, then walks the rest of the file block by block. Each cue is parsed from its timing line, HH:MM:SS.mmm --> HH:MM:SS.mmm (the format also permits the shorter MM:SS.mmm form, with the hours omitted). Any optional id line above the timing line is captured, and any cue settings after the timestamps are pulled into their own column. A summary line reports the cue count and total runtime, and the table lists every cue with its computed duration.
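As a rough illustration of the timing-line step, here is a minimal Python sketch. It is not the viewer's own code (which runs as JavaScript in the browser); the function name parse_timing_line and the returned field names are assumptions made for this example. It accepts timestamps with or without the hours field and works in whole milliseconds so the computed duration is exact.

```python
import re

# One timestamp: optional hours, then MM:SS.mmm (hours may be omitted in WebVTT).
_TS = r"(?:(\d{2,}):)?([0-5]\d):([0-5]\d)\.(\d{3})"
# Full timing line: start --> end, optionally followed by cue settings.
_TIMING = re.compile(rf"^{_TS}[ \t]+-->[ \t]+{_TS}(?:[ \t]+(.*))?$")


def _to_ms(hours, minutes, seconds, millis):
    """Convert captured timestamp fields to an integer number of milliseconds."""
    return ((int(hours or 0) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)


def parse_timing_line(line):
    """Return a dict of start, end, duration and settings, or None if not a timing line."""
    match = _TIMING.match(line)
    if match is None:
        return None
    groups = match.groups()
    start_ms = _to_ms(*groups[0:4])
    end_ms = _to_ms(*groups[4:8])
    return {
        "start_ms": start_ms,
        "end_ms": end_ms,
        "duration_ms": end_ms - start_ms,
        "settings": (groups[8] or "").strip(),
    }
```

For example, parse_timing_line("00:01:23.456 --> 00:01:25.000 line:0 align:start") gives a start of 83456 ms, a duration of 1544 ms and the settings string "line:0 align:start". A line that is not a timing line, such as caption text, returns None.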

Beyond cues, WebVTT files carry structural blocks the viewer surfaces separately. NOTE blocks are comments for the file's authors and are listed so you can read them without hunting through the text. STYLE blocks hold the embedded CSS that colours and positions cues, and any REGION definitions are noted too. The notice area flags overlapping cues, where one caption is still showing when the next begins, which is normal for some layouts but often a timing slip. Cue text, settings and any inline tags such as <c>, <v Speaker> or <i> are displayed verbatim and safely escaped, so markup shows as text rather than rendering.
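The overlap check can be sketched in a few lines of Python. This is an illustrative approach, not necessarily how the viewer implements it; the function name find_overlaps and the (start_ms, end_ms, text) tuple shape are assumptions for the example. Two cues are treated as overlapping when the later one starts strictly before the earlier one ends, so cues that merely touch are not flagged.

```python
def find_overlaps(cues):
    """Return (earlier, later) pairs where `later` starts before `earlier` has ended.

    `cues` is a list of (start_ms, end_ms, text) tuples in any order.
    """
    ordered = sorted(cues, key=lambda cue: cue[0])
    overlaps = []
    for i, earlier in enumerate(ordered):
        for later in ordered[i + 1:]:
            if later[0] >= earlier[1]:
                # Sorted by start time, so no further cue can overlap `earlier`.
                break
            overlaps.append((earlier, later))
    return overlaps
```

With cues at 0 to 2000 ms, 1500 to 3000 ms and 3000 to 4000 ms, only the first two are reported as a pair: the third begins exactly when the second ends, so it does not count as an overlap.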

What WebVTT is, and how it differs from SRT

WebVTT, the Web Video Text Tracks format, is the W3C specification for timed text on the web. It is the format the HTML5 <track> element expects, so when a <video> shows captions, subtitles, chapters or descriptions in a browser, those cues almost always come from a .vtt file. It was designed as a deliberate evolution of SRT: familiar enough that anyone who has edited subtitles will recognise it, but extended with the structure a styling-capable, accessibility-focused web platform needs.

The differences from SRT are small on the page but significant in practice. Every WebVTT file must begin with the literal string WEBVTT on the first line, which is how browsers identify it; without that header the track is rejected. Timecodes use a dot before the milliseconds (00:01:23.456) rather than SRT's comma. Cue identifiers are optional rather than the mandatory sequence numbers of SRT, and they can be any text, not just numbers. After the end timestamp a cue may carry cue settings (line:, position:, size:, align: and vertical:) that control where and how the caption appears. Files can include STYLE blocks of CSS that target cues with selectors like ::cue, NOTE comment blocks, and REGION definitions for scrolling roll-up areas. Inline tags go further too, adding voice tags (<v Bob>), class tags (<c.loud>) and timestamp tags for karaoke-style highlighting.

It also helps to be precise about captions versus subtitles, a distinction WebVTT makes explicit through the kind attribute on the <track> element. Subtitles assume the viewer can hear the audio and only need the dialogue translated or transcribed. Captions assume the viewer cannot hear, so they include non-speech information — speaker labels, sound effects, music cues — and are an accessibility requirement in many jurisdictions. WebVTT supports both, along with chapters for navigation and descriptions for audio description text. Converting an SRT file to WebVTT is mostly mechanical — add the WEBVTT header, change commas to dots — but the reverse loses any settings, styling or voice tags that SRT cannot represent.
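The SRT-to-WebVTT direction described above can be sketched as follows. This is a minimal Python illustration, assuming well-formed SRT input; the function name srt_to_vtt is hypothetical. It only rewrites the comma in timing lines, so commas inside caption text are left alone, and the SRT sequence numbers are kept because they are valid WebVTT cue identifiers.

```python
import re

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:04,000".
_SRT_TIMING = re.compile(
    r"^(\d{2,}:\d{2}:\d{2}),(\d{3})(\s+-->\s+)(\d{2,}:\d{2}:\d{2}),(\d{3})"
)


def srt_to_vtt(srt_text):
    """Convert SRT text to WebVTT: add the header, swap commas for dots in timings."""
    # Drop a leading byte order mark and normalise line endings to "\n".
    text = srt_text.lstrip("\ufeff").replace("\r\n", "\n").replace("\r", "\n")
    converted = [_SRT_TIMING.sub(r"\1.\2\3\4.\5", line) for line in text.split("\n")]
    return "WEBVTT\n\n" + "\n".join(converted).strip("\n") + "\n"
```

Going the other way would need to drop cue settings, STYLE blocks and voice tags, which is why this sketch covers only the lossless direction.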

Common use cases

  • Debugging HTML5 captions. Confirm a .vtt file parses and has the WEBVTT header before wiring it to a <track> element.
  • Reading cue settings. See the position and alignment applied to each cue without decoding the syntax by eye.
  • Finding overlaps. Spot cues that collide on screen, a common cause of captions stacking or flickering.
  • Inspecting styling and notes. Pull out STYLE blocks and NOTE comments to understand how a file is meant to render.
  • Private review. Check unpublished captions locally instead of uploading them to an online tool.

Frequently asked questions

Why must a WebVTT file start with WEBVTT?

The WEBVTT line is the format signature. Browsers and parsers use it to confirm the file is WebVTT before reading any cues, and a file without it is rejected by the HTML5 track element. The viewer checks for it and warns if it is missing.
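A signature check along these lines can be sketched in Python. This is an assumption-labelled illustration rather than the viewer's actual code, and the function name has_webvtt_signature is made up for the example. It allows an optional byte order mark before the signature and accepts WEBVTT either alone on the first line or followed by a space or tab and further text.

```python
import re


def has_webvtt_signature(text):
    """Return True if `text` opens with a WEBVTT signature line."""
    if text.startswith("\ufeff"):
        text = text[1:]  # an optional byte order mark may precede the signature
    first_line = re.split(r"\r\n|\n|\r", text, maxsplit=1)[0]
    if not first_line.startswith("WEBVTT"):
        return False
    rest = first_line[len("WEBVTT"):]
    # The signature must stand alone or be followed by a space or tab.
    return rest == "" or rest[0] in " \t"
```

Under this check "WEBVTT - English captions" passes, while "WEBVTTX", a lowercase "webvtt" or an SRT file that starts with "1" does not.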

How is WebVTT different from SRT?

WebVTT requires the WEBVTT header, uses a dot before the milliseconds instead of a comma, makes cue identifiers optional, and adds cue settings, STYLE blocks, NOTE comments and regions. SRT has none of those, which is why converting SRT to WebVTT mainly means adding the header and swapping commas for dots.

What are cue settings?

They are optional instructions placed after the end timestamp on a cue's timing line, such as line:0, position:50%, align:start or vertical:rl. They control where on the video the caption is drawn and how it is aligned. The viewer shows them in their own column so you can review them per cue.
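To make the name:value shape concrete, here is a small Python sketch that splits a settings string into a dictionary. The function name parse_cue_settings is hypothetical and the sketch does not validate values against the specification; it simply skips any token that lacks a name, a colon or a value.

```python
def parse_cue_settings(settings):
    """Split a cue settings string such as "line:0 align:start" into a dict."""
    parsed = {}
    for token in settings.split():
        name, colon, value = token.partition(":")
        if colon and name and value:
            # In this sketch a repeated name simply keeps its last value.
            parsed[name] = value
    return parsed
```

For example, parse_cue_settings("line:0 position:50% align:start") returns {"line": "0", "position": "50%", "align": "start"}.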

What is the difference between captions and subtitles?

Subtitles assume the viewer can hear and usually just transcribe or translate dialogue. Captions assume the viewer cannot hear and add speaker labels, sound effects and music cues, which makes them an accessibility feature. WebVTT supports both, selected by the kind attribute on the track element.

Is my caption file uploaded anywhere?

No. The .vtt file is read and parsed entirely in your browser with client-side JavaScript. Nothing is transmitted, so unpublished or confidential captions never leave your device.