VideoText workflow guide

Scribie Format Guide: Timestamps, Speaker Labels & Full Verbatim

A practical formatting reference for Scribie transcribers. Covers the mandatory [MM:SS] timestamp format, speaker label syntax without brackets, full verbatim requirements, and every notation you'll need.

Apply Scribie Format formatting rules before delivery

Why transcript formatting affects QA acceptance

The five most common transcript QA rejection triggers are: inconsistent speaker label format across the file, wrong verbatim level (clean applied where full was required), missing or incorrectly notated inaudible sections, timestamp placement errors, and paragraph length violations. A formatting pass that checks all five before delivery eliminates most revision cycles.
Style guide conflicts between platforms create real operational confusion: Rev's rules for handling crosstalk differ from GoTranscript's, and a transcript formatted correctly for one platform can fail QA on the other. Confirming which standard applies before formatting begins avoids that rework.
Speaker label drift, where the same speaker is labeled "JOHN SMITH", "John", and "Speaker 1" in the same document, is the formatting error that takes longest to correct manually, and it is the most common reason agency QA reviewers send files back.

From raw transcript to client-ready formatted file

1. Select and configure the target style guide

Choose Rev, GoTranscript, TranscribeMe, Scribie, or a custom client specification. Each guide sets its own rules for verbatim level, speaker label format, timestamp intervals, inaudible notation, and maximum paragraph length.

2. Set verbatim level explicitly

Clean verbatim removes fillers, false starts, and repetitions for readability. Full verbatim preserves all spoken content. Applying the wrong level is the single most common reason transcripts fail marketplace QA, and it cannot be fixed without re-listening to the source audio.

3. Normalize speaker labels throughout the file

A single speaker must have exactly one label format from the first occurrence to the last. Mixed formats (JOHN SMITH / John / J. Smith) require a find-and-replace pass across the full document before any other formatting work.
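
The find-and-replace pass can be scripted. The following is a minimal sketch, not a feature of any platform: the `ALIASES` map and the `normalize_labels` helper are hypothetical names, and the sketch assumes each label sits at the very start of a line and ends at the first colon.

```python
import re

# Hypothetical alias map: every variant found in the file -> one canonical label.
ALIASES = {
    "JOHN SMITH": "John Smith",
    "John": "John Smith",
    "J. Smith": "John Smith",
}

# Assumption: a label is the text between the start of a line and the first
# colon, and that colon is followed by whitespace.
LABEL_RE = re.compile(r"^(?P<label>[^\s:][^:\n]{0,39}):(?=\s)", re.MULTILINE)


def normalize_labels(text, aliases):
    """Rewrite every known label variant to its canonical form."""
    def swap(match):
        label = match.group("label")
        # Unknown labels are returned unchanged.
        return aliases.get(label.strip(), label) + ":"
    return LABEL_RE.sub(swap, text)


raw = "JOHN SMITH: Thanks for joining us.\nJohn: So as I was saying...\n"
print(normalize_labels(raw, ALIASES))
```

Lines that open with a timestamp, such as "[00:05] John:", are left untouched by this pattern, so a guide that puts the timestamp first would need a different anchor.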

4. Apply timestamp rules and paragraph breaks

Rev-style: timestamps every 2 minutes or at each speaker change. GoTranscript: no timestamp requirement by default. TranscribeMe: per-speaker-turn timestamps. Paragraph length limits range from 8 lines (Rev) to no limit (some custom clients).

5. Run pre-delivery QA check

Verify inaudible notation consistency (all [inaudible] or all [INAUDIBLE], never mixed), check bracket format for crosstalk sections, confirm verbatim level is consistent throughout, and validate that paragraph length does not exceed the client's maximum.
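
The inaudible-notation part of this check is easy to automate. Below is a rough sketch under stated assumptions: the set of tags treated as "unclear audio" markers and the function names are illustrative, not taken from any style guide.

```python
import re
from collections import Counter

# Assumption: these bracketed tags all mark unclear audio.
UNCLEAR_RE = re.compile(r"\[(?:inaudible|unclear)\]", re.IGNORECASE)


def inaudible_variants(text):
    """Count each distinct spelling of an unclear-audio tag, case-sensitively."""
    return Counter(m.group(0) for m in UNCLEAR_RE.finditer(text))


def is_consistent(text):
    """True when the file uses at most one spelling of the tag."""
    return len(inaudible_variants(text)) <= 1


sample = "I think [inaudible] and then [INAUDIBLE] but [inaudible]."
print(inaudible_variants(sample))
print(is_consistent(sample))
```

Printing the counts rather than only a pass/fail flag shows which spelling is the minority, which is usually the one to replace.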

Clean, verbatim, and platform-ready transcript outputs

Rev-style formatted transcript

Clean verbatim text. Speaker names in CAPITAL LETTERS followed by a colon. Timestamps in [HH:MM:SS] format at 2-minute intervals and at speaker changes. No paragraph longer than 8 lines. Inaudible sections marked [inaudible]. False starts and fillers removed.

GoTranscript QA-ready file

Optional full or clean verbatim per client request. Speaker labels in "Speaker Name:" format with consistent capitalization. [inaudible] notation for unclear audio. No timestamp requirement unless client specifies. Paragraph breaks at topic shifts rather than fixed intervals.

Client-ready DOCX handoff

Structured document with consistent heading-style speaker labels, paragraph breaks, correctly formatted timestamps, and clean or full verbatim as specified. Formatted for direct delivery — no manual cleanup required before attaching to the client email.

Transcriptionists and editors running formatting workflows

Freelance transcriptionists

Reduce revision risk before submitting marketplace jobs. A pre-delivery formatting check catches the label inconsistency, timestamp format error, or verbatim level mismatch that triggers a QA rejection and unpaid revision.

Agency QA leads and editors

Create a consistent formatting baseline across a team of transcriptionists working on the same client account — so reviewers spend time on content accuracy, not fixing label formats.

Researchers and journalists

Turn raw AI-generated transcripts into readable interview documents: speaker structure, consistent punctuation, accurate paragraph breaks, and timestamps that let readers verify context in the source recording.

Common QA rejection patterns and formatting errors

Speaker label drift example

Page 1: "JOHN SMITH: Thanks for joining us." Page 3: "John: So as I was saying..." Page 7: "Speaker 1: Right, exactly." — All three refer to the same speaker. Label drift of this kind fails QA at every major transcript marketplace.

Timestamp format inconsistency

[00:05:12] on page 2, (00:05:12) on page 4, [5:12] on page 6, and 00:05:12 on page 8, all in the same document. Most style guides require a single format, and automated QA checks reject files that mix them.
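
A quick way to spot this before delivery is to list which timestamp shapes occur in the file. The sketch below covers only the four shapes quoted above; the pattern names and the `timestamp_formats` helper are hypothetical.

```python
import re

# One pattern per timestamp shape; the names are illustrative labels only.
FORMATS = {
    "[HH:MM:SS]": re.compile(r"\[\d{2}:\d{2}:\d{2}\]"),
    "(HH:MM:SS)": re.compile(r"\(\d{2}:\d{2}:\d{2}\)"),
    "[MM:SS]": re.compile(r"\[\d{1,2}:\d{2}\]"),
    # Bare form: not wrapped in brackets or parentheses, not part of a longer run.
    "bare HH:MM:SS": re.compile(r"(?<![\[(\d:])\d{2}:\d{2}:\d{2}(?![\])\d:])"),
}


def timestamp_formats(text):
    """Return the set of timestamp shapes that appear at least once."""
    return {name for name, pattern in FORMATS.items() if pattern.search(text)}


mixed = "[00:05:12] Hello. (00:05:12) Hi. [5:12] Yes. 00:05:12 No."
print(sorted(timestamp_formats(mixed)))
```

A file that passes should yield a set with exactly one entry, and that entry should match the format the target style guide requires.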

Inaudible notation mismatch

"[inaudible]", "[INAUDIBLE]", "[unclear]", and "[crosstalk]" appearing in the same file for the same type of audio problem. Rev requires "[inaudible]" in lowercase brackets. GoTranscript also uses "[inaudible]" but, unlike Scribie, puts a timestamp inside the notation. Neither accepts a mix of styles.

Verbatim level inconsistency

Pages 1–4 are clean verbatim (fillers removed). Pages 5–8 switch to full verbatim (fillers preserved). This happens when an ASR tool applies different post-processing across transcript segments — and it fails QA because the verbatim level is supposed to be consistent throughout.
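
A switch like this can be flagged with a crude heuristic: compare filler-word density across segments. This is only a sketch. The filler list, the 2% threshold, and the function names are all assumptions, and a segment where the speaker simply used no fillers will look "clean" even in a correct full verbatim transcript.

```python
import re

# Assumption: a small, non-exhaustive list of filler words.
FILLER_RE = re.compile(r"\b(?:um|uh|erm|hmm)\b", re.IGNORECASE)


def filler_rate(segment):
    """Fillers per word in one segment (0.0 for an empty segment)."""
    words = len(segment.split())
    return len(FILLER_RE.findall(segment)) / words if words else 0.0


def mixed_verbatim(segments, threshold=0.02):
    """True when some segments look filler-rich and others filler-free."""
    flags = [filler_rate(s) >= threshold for s in segments if s.strip()]
    return any(flags) and not all(flags)


clean = "We launched the product in March and sales grew."
full = "Um, we, uh, launched the, um, product."
print(mixed_verbatim([clean, full]))
```

A positive result is a prompt to re-check the flagged segments against the audio, not proof of an error.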

Style guide differences across platforms

Rev style guide rules

Clean verbatim by default. Speaker names in ALL CAPS followed by a colon. Timestamp format [HH:MM:SS] every 2 minutes and at speaker changes. Maximum paragraph: 8 lines. Inaudible: [inaudible]. Crosstalk: [crosstalk]. False starts and filler words removed.

GoTranscript style guide rules

Clean or full verbatim per client request. Speaker format "Speaker Name:" with title case. [inaudible] for unclear audio. No timestamp requirement by default (client can request). No strict paragraph length limit. Crosstalk marked with [crosstalk].

TranscribeMe style guide rules

Strict verbatim — all fillers and false starts preserved. Timestamps required at each speaker turn. Speaker format "SPEAKER NAME:" in caps. Specific bracket format for technical notation. Paragraph breaks only at speaker changes.

Custom client formatting conflicts

Many agencies specify hybrid rules that don't map cleanly to any standard guide — for example, Rev-style speaker labels but GoTranscript-style verbatim and no timestamps. These combinations must be documented explicitly because the formatter cannot infer them.
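
One way to document a hybrid specification explicitly is to write it down as a small, validated record. The sketch below is illustrative: `StyleSpec` and its field names are hypothetical, and the allowed values cover only the options mentioned in this guide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StyleSpec:
    """Hypothetical record of one client's formatting rules."""
    verbatim: str                          # "clean" or "full"
    speaker_label_case: str                # "upper" or "title"
    timestamp_interval_s: Optional[int]    # None = no timestamps
    max_paragraph_lines: Optional[int]     # None = no limit

    def __post_init__(self):
        # Reject values the formatter would otherwise have to guess at.
        if self.verbatim not in ("clean", "full"):
            raise ValueError(f"unknown verbatim level: {self.verbatim!r}")
        if self.speaker_label_case not in ("upper", "title"):
            raise ValueError(f"unknown label case: {self.speaker_label_case!r}")


# The hybrid described above: capitalized labels as in the Rev rules,
# a verbatim level chosen by the client, and no timestamps.
HYBRID = StyleSpec(
    verbatim="full",
    speaker_label_case="upper",
    timestamp_interval_s=None,
    max_paragraph_lines=None,
)
print(HYBRID)
```

Making every field required means a rule cannot be left unstated by accident, which is the failure this section warns about.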

Transcript formatting and style guide questions

What is the exact paragraph format for Scribie?

Each Scribie paragraph starts with [MM:SS] Speaker N: (timestamp in square brackets, then the speaker label with a colon, then the text). Start a new paragraph at every speaker change, and break a long turn by a single speaker so that no paragraph runs past 4–6 lines.
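
The paragraph opener can be checked mechanically. This is a minimal sketch that assumes the literal "[MM:SS] Speaker N:" shape described above, with generic numbered speaker labels; the `parse_paragraph` helper is hypothetical, and allowing more than two minute digits for long recordings is a guess.

```python
import re

# Assumed shape: "[MM:SS] Speaker N: text", with single spaces as separators.
PARA_RE = re.compile(r"^\[(\d{2,}):([0-5]\d)\] (Speaker \d+): (\S.*)$")


def parse_paragraph(line):
    """Return (offset_in_seconds, speaker, text), or None if the opener is malformed."""
    match = PARA_RE.match(line)
    if match is None:
        return None
    minutes, seconds, speaker, text = match.groups()
    return int(minutes) * 60 + int(seconds), speaker, text


print(parse_paragraph("[01:05] Speaker 2: Right, exactly."))
print(parse_paragraph("[00:05:12] Speaker 1: Thanks for joining us."))
```

The second call returns None because the timestamp uses the [HH:MM:SS] shape rather than [MM:SS].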

How should inaudible audio be marked in Scribie?

Use [inaudible] — square brackets, lowercase, no timestamp inside the notation itself (unlike GoTranscript). Place it exactly where the inaudible word(s) occur. The paragraph-level timestamp tells the client approximately when it happens.

Can I switch between verbatim levels within a single Scribie job?

No. The verbatim level is set per job by the client. If the job is full verbatim, all text must be full verbatim. If it's non-verbatim (clean), all text must be clean. You cannot mix approaches within a single transcript.

How are false starts formatted in Scribie full verbatim?

In Scribie full verbatim, false starts are included with an em dash: "I — I was thinking about" or with a comma: "I, I was thinking about." Both are acceptable. The key is capturing the false start exactly as it occurred in the audio.

How does Scribie handle music or background noise?

Use [music], [background noise], [applause], or [laughter] — square brackets, lowercase. Include sound effects when they are significant to understanding the content. Minor consistent background noise is not marked for every instance, only when it first appears or changes significantly.