VideoText workflow guide

ADA Video Captions — Generate Captions for Section 508 Compliance

Section 508 (part of the Rehabilitation Act and often cited alongside the ADA) and WCAG 2.1 Level AA require synchronised captions for pre-recorded video content on public websites. VideoText generates accurate captions from any video in under 2 minutes. Upload an MP4 or paste a YouTube URL and get a fully synchronised SRT or VTT file ready for upload. Used by universities, government agencies, corporate training teams, and healthcare providers.

Where subtitle workflows break in real production

Reading speed is the invisible subtitle constraint: 14–17 characters per second is the readable range for most viewers; anything above 21 CPS causes comprehension failure even when the transcript text is accurate. Most automated subtitle tools generate lines without CPS checks.
Burned captions create a permanent workflow commitment — any timing correction or text fix after encoding requires re-encoding the full video. For long-form content this is hours of lost render time, which is why burn decisions need to happen after QA, not before.
Platform subtitle requirements are not interchangeable: TikTok displays a 2-line maximum with specific font rendering; YouTube accepts up to 1,500 SRT blocks and requires UTF-8 encoding; Instagram Reels ignores soft subtitle tracks on autoplay entirely, making burned captions the only reliable option for Reels.

From raw video to export-ready subtitle file

1. Generate subtitle file with word-level timestamps

Upload the video or paste a public URL. Word-level timestamp alignment produces more accurate line breaks than sentence-level alignment: each subtitle break can fall at a natural pause rather than in the middle of a phrase.

2. Review CPS on every subtitle line

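Flag any line above 17 CPS, then split it and give each part enough display time, or condense the wording. Splitting alone does not lower CPS if the total display window stays the same. An 8-word subtitle in a 1.2-second window hits approximately 24 CPS or more, which is unreadable for most viewers. Merge subtitle lines shorter than 0.8 seconds, which display too briefly to register.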

3. Check for timing overlap and minimum duration

Subtitle overlap — where the next block starts before the previous one ends — causes display flicker in most players. Any gap shorter than 0.1 seconds between adjacent subtitles should be merged or widened to prevent rendering artifacts.

4. Select platform-appropriate export format

Export SRT for YouTube, Vimeo, and video editors. Export VTT for HTML5 web players and streaming platforms that support caption positioning. Choose burned-caption output for Instagram Reels, TikTok clips, and any social context where autoplay without sound is expected.
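Steps 2 and 3 above can be sketched as a single review pass. This is a minimal illustration, not VideoText's implementation: the `Cue` class, the `review` function, and the choice to count spaces as characters are all assumptions made for the example, and the thresholds are simply the figures quoted in the steps.

```python
from dataclasses import dataclass

MAX_CPS = 17.0         # step 2: flag anything faster than this
MIN_DURATION_MS = 800  # step 2: shorter cues display too briefly
MIN_GAP_MS = 100       # step 3: minimum gap between adjacent cues


@dataclass
class Cue:  # hypothetical container for one subtitle block
    start_ms: int
    end_ms: int
    text: str

    @property
    def duration_ms(self) -> int:
        return self.end_ms - self.start_ms

    @property
    def cps(self) -> float:
        # Assumption: spaces count, and a line break counts as one space.
        chars = len(self.text.replace("\n", " ").strip())
        if self.duration_ms <= 0:
            return float("inf")
        return chars * 1000 / self.duration_ms


def review(cues: list[Cue]) -> list[tuple[int, str]]:
    """Return (index, reason) pairs for cues that need manual attention."""
    flags = []
    for i, cue in enumerate(cues):
        if cue.cps > MAX_CPS:
            flags.append((i, "too fast"))
        if cue.duration_ms < MIN_DURATION_MS:
            flags.append((i, "too short"))
        if i + 1 < len(cues):
            gap = cues[i + 1].start_ms - cue.end_ms
            if gap < 0:
                flags.append((i, "overlaps next"))
            elif 0 < gap < MIN_GAP_MS:
                # Back-to-back cues (gap of exactly 0) are treated as fine here.
                flags.append((i, "gap too small"))
    return flags
```

The function only reports problems; deciding whether to split, merge, or retime a flagged cue is left to the reviewer.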

SRT, VTT, and burned caption outputs

SRT subtitle file

Standard timed subtitle format: sequence number, timestamp pair (00:00:00,000 --> 00:00:00,000, with a comma before the milliseconds), and one or two lines of caption text per block. Supported by YouTube, Vimeo, LinkedIn, most video editors, and accessibility workflows.
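The block layout described above can be illustrated with a small writer. This is a sketch under stated assumptions: the function names and the `(start_ms, end_ms, text)` tuple shape are invented for the example, and in an actual SRT file the arrow between the two timestamps is written as `-->`.

```python
def srt_timestamp(ms: int) -> str:
    """Format milliseconds as HH:MM:SS,mmm (comma before the milliseconds)."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def to_srt(cues: list[tuple[int, int, str]]) -> str:
    """Build SRT text from (start_ms, end_ms, text) tuples (hypothetical shape)."""
    blocks = []
    for number, (start, end, text) in enumerate(cues, start=1):
        timing = f"{srt_timestamp(start)} --> {srt_timestamp(end)}"
        blocks.append(f"{number}\n{timing}\n{text}\n")
    # Blocks are separated by one blank line.
    return "\n".join(blocks)


print(to_srt([(0, 1500, "Hello"), (2000, 4000, "Line one\nLine two")]))
```

When saving the result, write it with UTF-8 encoding, for example `open(path, "w", encoding="utf-8")`.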

VTT subtitle stream

WebVTT format with period timestamp separators (00:00:00.000 --> 00:00:00.000). Used by HTML5 video players, streaming services, and accessibility platforms. Unlike SRT, VTT supports cue settings for positioning, text styling, and karaoke-style highlighting through inline timestamps.

Burned captions

Permanently embedded caption text rendered directly into the video frame — cannot be toggled off by the viewer. Required for platforms that strip external caption tracks. Font rendering, shadow depth, and vertical position all need QA review after encoding because video compression can degrade caption legibility.

Creators and teams running subtitle workflows

Video creators publishing to multiple platforms

Generate SRT for YouTube upload, export burned captions for Instagram Reels, and produce VTT for course platform embeds — all from the same subtitle pass without reformatting timing.

Accessibility compliance teams

Produce synchronised captions for WCAG 2.1 conformance. CPS validation catches lines that are too fast to read before the video goes live.

Post-production editors

Export SRT or VTT for import into Premiere, DaVinci Resolve, or Final Cut Pro. Accurate word-level timestamps eliminate the need to manually re-sync caption timing after import.

Subtitle edge cases that cause QA failure

CPS violation — splitting required

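Eight words spoken in 1.2 seconds: "we need to fix this before the deadline" is 39 characters including spaces, or about 32 CPS. Splitting the line into two segments does not lower CPS on its own, because the text and the display time are divided together. The total display time has to stretch to roughly 2.3 seconds (39 ÷ 17), or the wording has to be condensed, to reach the readable 14–17 CPS range.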
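The arithmetic behind this example can be checked with a short sketch. It assumes spaces count as characters (conventions differ, so the exact figure can vary), and the helper names `min_duration` and `split_near_middle` are hypothetical.

```python
def min_duration(text: str, target_cps: float = 17.0) -> float:
    """Shortest display time in seconds that keeps text at or under target_cps."""
    return len(text) / target_cps


def split_near_middle(text: str) -> tuple[str, str]:
    """Split a caption at the space closest to its midpoint."""
    mid = len(text) // 2
    spaces = [i for i, ch in enumerate(text) if ch == " "]
    if not spaces:
        return text, ""
    cut = min(spaces, key=lambda i: abs(i - mid))
    return text[:cut], text[cut + 1:]


line = "we need to fix this before the deadline"
print(len(line))                     # 39 characters, spaces included
print(round(len(line) / 1.2, 1))     # 32.5 CPS in a 1.2-second window
print(round(min_duration(line), 2))  # 2.29 seconds needed at 17 CPS
print(split_near_middle(line))       # ('we need to fix this', 'before the deadline')
```

Each half is 19 characters, so each needs a little over 1.1 seconds on screen at 17 CPS. Splitting helps readability only if the two segments together get that extra display time.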

Subtitle timing overlap

A block ending at 00:01:23,500 followed by a block starting at 00:01:23,200 overlaps by 300 ms, which causes display flicker and rendering failure in most SRT players. A subtitle validator catches this before upload.
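A validator of this kind mostly comes down to parsing SRT timestamps and comparing neighbours. The sketch below shows that core check; `to_ms` and `overlap_ms` are illustrative names, not part of any particular tool.

```python
import re

# HH:MM:SS,mmm as used in SRT files
SRT_STAMP = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")


def to_ms(stamp: str) -> int:
    """Convert an SRT timestamp to milliseconds."""
    match = SRT_STAMP.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"not an SRT timestamp: {stamp!r}")
    hours, minutes, seconds, millis = map(int, match.groups())
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis


def overlap_ms(prev_end: str, next_start: str) -> int:
    """Milliseconds by which the next block starts before the previous one ends."""
    return max(0, to_ms(prev_end) - to_ms(next_start))


print(overlap_ms("00:01:23,500", "00:01:23,200"))  # 300
```

A result of 0 means the two blocks do not overlap; any positive value is the amount to trim from the earlier block or shift on the later one.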

Mobile frame crop

A two-line subtitle with 44+ characters per line at default font sizes clips at the bottom of mobile video frames. The safe area for mobile subtitles is a single line, 38 characters maximum, positioned at 85% vertical height.
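The single-line, 38-character limit can be checked mechanically. The following is a rough sketch using the standard-library `textwrap` module; the function names are made up for the example, and vertical positioning is a rendering setting that this check does not cover.

```python
import textwrap

MAX_CHARS = 38  # single-line safe width quoted above


def fits_mobile(text: str, max_chars: int = MAX_CHARS) -> bool:
    """True if the caption is one line and within the safe width."""
    return "\n" not in text and len(text) <= max_chars


def rewrap_for_mobile(text: str, max_chars: int = MAX_CHARS) -> list[str]:
    """Break a caption into single-line chunks, each meant for its own cue."""
    flat = " ".join(text.split())  # collapse line breaks and extra spaces
    return textwrap.wrap(flat, width=max_chars)


caption = "Captions that run past the safe width\nget clipped on phones"
print(fits_mobile(caption))
print(rewrap_for_mobile(caption))
```

Each chunk returned by `rewrap_for_mobile` still needs its own start and end time, so the original cue's window has to be divided between them.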

Burned caption rendering after encode

Font stroke weight, drop shadow depth, and subtitle vertical position all shift slightly after H.264 encoding. Captions that look correct in preview may become harder to read in the encoded file, so a post-encode QA pass is needed before publishing.

Platform-specific subtitle requirements

YouTube SRT requirements

UTF-8 encoding required. Maximum 1,500 subtitle blocks per file. Timestamp format: 00:00:00,000 (comma separator, not period). Files over 1,500 blocks need splitting before upload. YouTube auto-syncs uploaded SRT to audio — small timing offsets are corrected automatically.
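Two of these checks, the block limit and the encoding, are easy to automate before upload. The sketch below takes the 1,500-block figure from the text above as given; `split_srt` and `is_utf8` are hypothetical helper names.

```python
import re

MAX_BLOCKS = 1500  # limit quoted above; confirm against current platform docs


def split_srt(srt_text: str, max_blocks: int = MAX_BLOCKS) -> list[str]:
    """Split SRT text into parts of at most max_blocks blocks each.

    Sequence numbers restart at 1 in every part; timestamps are left unchanged.
    """
    normalised = srt_text.replace("\r\n", "\n").strip()
    blocks = [b for b in re.split(r"\n\s*\n", normalised) if b.strip()]
    parts = []
    for offset in range(0, len(blocks), max_blocks):
        renumbered = []
        for number, block in enumerate(blocks[offset:offset + max_blocks], start=1):
            lines = block.split("\n")
            lines[0] = str(number)  # first line of a block is its sequence number
            renumbered.append("\n".join(lines))
        parts.append("\n\n".join(renumbered) + "\n")
    return parts


def is_utf8(data: bytes) -> bool:
    """True if the raw file bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False
```

Reading the file in binary mode and passing the bytes to `is_utf8` before upload catches files that were saved in a legacy encoding such as Latin-1.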

TikTok caption behavior

TikTok generates its own auto-captions and displays them over uploaded SRT files in most cases. Burned captions are the reliable method for ensuring accurate captions appear on TikTok content. SRT upload works but TikTok may override it.

Instagram Reels caption handling

Instagram does not display soft subtitle tracks during autoplay in Feed or Reels. Burned captions are the only reliable method for ensuring captions appear for silent autoplay viewers. Instagram's built-in auto-caption feature can also be used after upload, but accuracy varies.

VTT vs SRT encoding difference

VTT uses period timestamp separators (00:00:00.000) while SRT uses commas (00:00:00,000). Swapping the separator character breaks parsing in most players. VTT files must begin with the WEBVTT header line — SRT files must not.
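The two differences described above are enough to sketch a basic SRT-to-VTT conversion. This is a simplified illustration (`srt_to_vtt` is a made-up name) that handles plain SRT only: it adds the header and rewrites separators on timing lines, leaving commas in caption text alone.

```python
import re

# Matches HH:MM:SS,mmm so the comma can be swapped for a period.
SRT_STAMP = re.compile(r"(\d{2}:\d{2}:\d{2}),(\d{3})")


def srt_to_vtt(srt_text: str) -> str:
    """Convert plain SRT text to WebVTT text."""
    out = ["WEBVTT", ""]  # required header, then a blank line
    for line in srt_text.replace("\r\n", "\n").strip().split("\n"):
        if "-->" in line:
            # Only timing lines are rewritten, so "1,000" in a caption survives.
            line = SRT_STAMP.sub(r"\1.\2", line)
        out.append(line)
    return "\n".join(out) + "\n"


srt = "1\n00:00:01,000 --> 00:00:02,500\nThat costs 1,000 dollars\n"
print(srt_to_vtt(srt))
```

The SRT sequence numbers are kept; WebVTT treats a line before the timing line as an optional cue identifier, so they are harmless.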

Subtitle workflow questions answered

What makes video captions ADA-compliant?

ADA-compliant captions must be: (1) accurate — 99%+ word accuracy for broadcast, high accuracy for web video, (2) synchronised — captions must match the audio timing, (3) complete — all spoken words and relevant non-speech audio must be captioned, (4) properly placed — not covering important visual content.

Who needs to caption their videos under the ADA?

Federal agencies (Section 508), state and local governments (ADA Title II), and places of public accommodation — including universities, businesses, and healthcare providers — must caption video content on their public-facing websites. The ADA applies broadly: if your organisation serves the public, video captions are typically required.

Can AI-generated captions meet ADA requirements?

AI-generated captions at 98%+ accuracy (like VideoText's Whisper-powered output) are generally sufficient for most video content. For legal proceedings, medical content, or content where accuracy is critical, human review of the AI output is recommended. VideoText produces an editable transcript alongside the caption file for this purpose.

What format should ADA captions be in?

SRT and VTT files are accepted by all major platforms (YouTube, Vimeo, Kaltura, Panopto, Brightcove). For broadcast and enterprise delivery, SCC (CEA-608/708) or TTML may be required. VideoText exports SRT and VTT.