VideoText workflow guide

TranscribeMe Format Guide: Speaker Labels, Timestamps & Paragraph Rules

Detailed formatting specifications for TranscribeMe transcribers. Covers speaker label syntax, timestamp format, paragraph structure, number formatting, and punctuation rules — with examples for each.

Apply TranscribeMe formatting rules before delivery

Why transcript formatting affects QA acceptance

The five most common transcript QA rejection triggers are: inconsistent speaker label format across the file, wrong verbatim level (clean applied where full was required), missing or incorrectly notated inaudible sections, timestamp placement errors, and paragraph length violations. A formatting pass that checks all five before delivery eliminates most revision cycles.
Style guide conflicts between platforms create real operational confusion: Rev's rules for handling crosstalk differ from GoTranscript's, and a transcript formatted correctly for one platform fails QA on the other. Knowing which standard applies before formatting begins saves the rework.
Speaker label drift, where the same speaker gets labeled "JOHN SMITH", "John", and "Speaker 1" in the same document, is the formatting error that takes the longest to correct manually, and it is the most common reason agency QA reviewers send files back.

From raw transcript to client-ready formatted file

1. Select and configure the target style guide

Choose Rev, GoTranscript, TranscribeMe, Scribie, or a custom client specification. Each guide has its own rules for verbatim level, speaker label format, timestamp intervals, inaudible notation, and maximum paragraph length.

2. Set verbatim level explicitly

Clean verbatim removes fillers, false starts, and repetitions for readability. Full verbatim preserves all spoken content. Applying the wrong level is the most common single reason transcripts fail marketplace QA, and a transcript that was cleaned when full verbatim was required cannot be restored without going back to the source audio.

3. Normalize speaker labels throughout the file

A single speaker must have exactly one label format from the first occurrence to the last. Mixed formats (JOHN SMITH / John / J. Smith) require a find-and-replace pass across the full document before any other formatting work.
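The find-and-replace pass can be sketched in a few lines of Python. This is a minimal illustration, not part of any platform's tooling: the alias map, the `normalize_labels` name, and the assumption that a label is the text before the first colon at the start of a line are all hypothetical.

```python
import re

# Hypothetical alias map: every variant seen in the file -> the one canonical label.
ALIASES = {
    "john smith": "JOHN SMITH",
    "john": "JOHN SMITH",
    "j. smith": "JOHN SMITH",
}

# Assumption: a speaker label is the short text before the first colon on a line.
LABEL_RE = re.compile(r"^([^:\n]{1,40}):", re.MULTILINE)


def normalize_labels(text: str, aliases: dict[str, str]) -> str:
    """Rewrite every known label variant to its canonical form."""

    def swap(match: re.Match) -> str:
        raw = match.group(1).strip()
        canonical = aliases.get(raw.lower())
        # Unknown labels are left untouched so they can be reviewed by hand.
        return f"{canonical}:" if canonical else match.group(0)

    return LABEL_RE.sub(swap, text)
```

Labels missing from the map pass through unchanged, so a new speaker is never silently merged into an existing one.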

4. Apply timestamp rules and paragraph breaks

Rev-style: timestamps every 2 minutes or at each speaker change. GoTranscript: no timestamp requirement by default. TranscribeMe: per-speaker-turn timestamps. Paragraph length limits range from 8 lines (Rev) to no limit (some custom clients).

5. Run pre-delivery QA check

Verify inaudible notation consistency (all [inaudible] or all [INAUDIBLE], never mixed), check bracket format for crosstalk sections, confirm verbatim level is consistent throughout, and validate that paragraph length does not exceed the client's maximum.
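Two of these checks, inaudible notation and paragraph length, are easy to automate. The sketch below is one possible approach under stated assumptions: paragraphs are separated by blank lines, and "lines" are counted by wrapping at a hypothetical 80-character width, since the real line count depends on the delivery format. The `qa_issues` name and its parameters are illustrative.

```python
import re
import textwrap

# Covers the inaudible-style tags only; crosstalk and verbatim checks are out of scope here.
INAUDIBLE_RE = re.compile(r"\[(?:inaudible|unclear)\]", re.IGNORECASE)


def qa_issues(text: str, max_lines: int = 8, wrap_width: int = 80) -> list[str]:
    """Return a list of human-readable problems; an empty list means both checks passed."""
    issues = []

    # Check 1: only one spelling/casing of the inaudible tag may appear.
    variants = {m.group(0) for m in INAUDIBLE_RE.finditer(text)}
    if len(variants) > 1:
        issues.append(f"mixed inaudible notation: {sorted(variants)}")

    # Check 2: no paragraph may exceed the client's maximum line count.
    for number, paragraph in enumerate(text.split("\n\n"), start=1):
        line_count = len(textwrap.wrap(paragraph, width=wrap_width))
        if line_count > max_lines:
            issues.append(f"paragraph {number} is {line_count} lines (max {max_lines})")

    return issues
```

Returning a list of issues rather than a pass/fail flag lets the transcriptionist see every problem in one run.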

Clean, verbatim, and platform-ready transcript outputs

Rev-style formatted transcript

Clean verbatim text. Speaker names in CAPITAL LETTERS followed by a colon. Timestamps in [HH:MM:SS] format at 2-minute intervals and at speaker changes. No paragraph longer than 8 lines. Inaudible sections marked [inaudible]. False starts and fillers removed.

GoTranscript QA-ready file

Optional full or clean verbatim per client request. Speaker labels in "Speaker Name:" format with consistent capitalization. [inaudible] notation for unclear audio. No timestamp requirement unless client specifies. Paragraph breaks at topic shifts rather than fixed intervals.

Client-ready DOCX handoff

Structured document with consistent heading-style speaker labels, paragraph breaks, correctly formatted timestamps, and clean or full verbatim as specified. Formatted for direct delivery — no manual cleanup required before attaching to the client email.

Transcriptionists and editors running formatting workflows

Freelance transcriptionists

Reduce revision risk before submitting marketplace jobs. A pre-delivery formatting check catches the label inconsistency, timestamp format error, or verbatim level mismatch that triggers a QA rejection and unpaid revision.

Agency QA leads and editors

Create a consistent formatting baseline across a team of transcriptionists working on the same client account — so reviewers spend time on content accuracy, not fixing label formats.

Researchers and journalists

Turn raw AI-generated transcripts into readable interview documents: speaker structure, consistent punctuation, accurate paragraph breaks, and timestamps that let readers verify context in the source recording.

Common QA rejection patterns and formatting errors

Speaker label drift example

Page 1: "JOHN SMITH: Thanks for joining us." Page 3: "John: So as I was saying..." Page 7: "Speaker 1: Right, exactly." — All three refer to the same speaker. Label drift of this kind fails QA at every major transcript marketplace.

Timestamp format inconsistency

[00:05:12] on page 2, (00:05:12) on page 4, [5:12] on page 6, and 00:05:12 on page 8, all in the same document. Most style guides require a single format, and automated QA systems typically reject a file that mixes them.
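A quick way to spot this problem is to count how many distinct timestamp styles a file contains. The following is a rough sketch covering only the four styles listed above; the `timestamp_styles` name and the style labels are made up for illustration, and the bare-timestamp pattern can also match a clock time that a speaker read aloud.

```python
import re
from collections import Counter

# One pattern per timestamp style named in the example above.
PATTERNS = {
    "[HH:MM:SS]": re.compile(r"\[\d{2}:\d{2}:\d{2}\]"),
    "(HH:MM:SS)": re.compile(r"\(\d{2}:\d{2}:\d{2}\)"),
    "[M:SS]": re.compile(r"\[\d{1,2}:\d{2}\]"),
    # Bare form: not preceded or followed by a bracket, paren, digit, or colon.
    "bare HH:MM:SS": re.compile(r"(?<![\[(\d:])\d{2}:\d{2}:\d{2}(?![\])\d:])"),
}


def timestamp_styles(text: str) -> Counter:
    """Count occurrences of each timestamp style found in the text."""
    counts = Counter()
    for style, pattern in PATTERNS.items():
        found = len(pattern.findall(text))
        if found:
            counts[style] = found
    return counts


def is_consistent(text: str) -> bool:
    """A file is consistent when it uses at most one timestamp style."""
    return len(timestamp_styles(text)) <= 1
```

The counts show which style dominates, which is usually the one to keep when fixing the rest.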

Inaudible notation mismatch

"[inaudible]", "[INAUDIBLE]", and "[unclear]" appearing in the same file for the same type of audio problem, or "[crosstalk]" used where the audio was simply unclear. Rev requires "[inaudible]" in lowercase brackets, and the GoTranscript rules summarized in this guide use the same tag. Neither accepts a mix of styles.

Verbatim level inconsistency

Pages 1–4 are clean verbatim (fillers removed). Pages 5–8 switch to full verbatim (fillers preserved). This happens when an ASR tool applies different post-processing across transcript segments — and it fails QA because the verbatim level is supposed to be consistent throughout.
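A crude heuristic for catching this is to count filler words per segment and flag files where some segments contain fillers and others contain none. This is only a sketch under loose assumptions: the filler list is a small hypothetical sample, the function names are illustrative, and a short segment can have no fillers simply because the speaker used none.

```python
import re

# Hypothetical, deliberately short filler list; extend it for real use.
FILLER_RE = re.compile(r"\b(?:um|uh|er|you know)\b", re.IGNORECASE)


def filler_counts(segments: list[str]) -> list[int]:
    """Number of filler words found in each segment."""
    return [len(FILLER_RE.findall(segment)) for segment in segments]


def looks_mixed(segments: list[str]) -> bool:
    """True when some segments contain fillers and others contain none."""
    counts = filler_counts(segments)
    return any(c == 0 for c in counts) and any(c > 0 for c in counts)
```

A positive result is a prompt to check the flagged segments by hand, not proof of a verbatim mismatch.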

Style guide differences across platforms

Rev style guide rules

Clean verbatim by default. Speaker names in ALL CAPS followed by a colon. Timestamp format [HH:MM:SS] every 2 minutes and at speaker changes. Maximum paragraph: 8 lines. Inaudible: [inaudible]. Crosstalk: [crosstalk]. False starts and filler words removed.

GoTranscript style guide rules

Clean or full verbatim per client request. Speaker format "Speaker Name:" with title case. [inaudible] for unclear audio. No timestamp requirement by default (client can request). No strict paragraph length limit. Crosstalk marked with [crosstalk].

TranscribeMe style guide rules

Strict verbatim — all fillers and false starts preserved. Timestamps required at each speaker turn. Speaker format "SPEAKER NAME:" in caps. Specific bracket format for technical notation. Paragraph breaks only at speaker changes.

Custom client formatting conflicts

Many agencies specify hybrid rules that don't map cleanly to any standard guide — for example, Rev-style speaker labels but GoTranscript-style verbatim and no timestamps. These combinations must be documented explicitly because the formatter cannot infer them.
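One way to document a hybrid explicitly is to write the rules down as data and derive the client variant from a base guide. The sketch below is an assumption-laden illustration: the `StyleGuide` fields and the "client-x" name are hypothetical, the Rev values come from the summary above, and the hybrid's verbatim level is set to "full" only as an example of a client choice.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class StyleGuide:
    name: str
    verbatim: str                          # "clean" or "full"
    label_case: str                        # "upper" or "title"
    timestamp_interval_s: Optional[int]    # None = no interval timestamps
    timestamp_at_speaker_change: bool
    max_paragraph_lines: Optional[int]     # None = no limit


# Rev rules as summarized in this guide.
REV = StyleGuide(
    name="rev",
    verbatim="clean",
    label_case="upper",
    timestamp_interval_s=120,
    timestamp_at_speaker_change=True,
    max_paragraph_lines=8,
)

# Hypothetical hybrid: Rev-style labels, client-chosen verbatim, no timestamps.
CLIENT_X = replace(
    REV,
    name="client-x",
    verbatim="full",
    timestamp_interval_s=None,
    timestamp_at_speaker_change=False,
)
```

Deriving the hybrid with `replace` keeps every deviation from the base guide visible in one place.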

Transcript formatting and style guide questions

How are paragraphs structured in TranscribeMe?

New paragraph for each speaker change. Within a single speaker's long turn, break paragraphs every 4–6 sentences or approximately 300 characters. Avoid very short paragraphs (single sentences) and very long ones (more than 8 sentences). Balanced paragraphs improve readability and QA scores.
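The sentence-count rule can be sketched as a small splitter for one speaker's long turn. This is an illustration under simplifying assumptions: sentences are split naively on ., !, or ? followed by whitespace (so an abbreviation such as "Dr." will split early), the character guideline is ignored, and the `split_turn` name and the default of five sentences per paragraph are hypothetical.

```python
import re


def split_turn(text: str, per_paragraph: int = 5) -> list[str]:
    """Split one speaker turn into paragraphs of roughly per_paragraph sentences."""
    # Naive sentence split: break after ., !, or ? when followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

    # Group sentences into fixed-size chunks.
    paragraphs = [
        sentences[i:i + per_paragraph]
        for i in range(0, len(sentences), per_paragraph)
    ]

    # Avoid a trailing single-sentence paragraph by folding it into the previous one.
    if len(paragraphs) > 1 and len(paragraphs[-1]) == 1:
        paragraphs[-2].extend(paragraphs.pop())

    return [" ".join(paragraph) for paragraph in paragraphs]
```

With the default setting, an eleven-sentence turn becomes two paragraphs of five and six sentences instead of five, five, and one.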

What is the correct TranscribeMe speaker label format?

[Name]: — square brackets around the name, colon after the closing bracket, then the speaker's text. For unknown speakers: [Speaker 1]:, [Speaker 2]:. Example: [Dr. Smith]: The results were encouraging.

When should I use an em dash vs. ellipsis in TranscribeMe?

Use an em dash (—) when a speaker is abruptly cut off or interrupted. Use an ellipsis (...) when a speaker trails off without completing their thought. Both are acceptable in TranscribeMe transcripts; the choice depends on what actually happened in the audio.

How does TranscribeMe format monetary amounts?

Monetary amounts use symbols and numerals: $50, €100, £250. Spell out informal references: "fifty bucks", "a hundred euros". Large round amounts: $5 million (not $5,000,000 unless precision matters).

How does TranscribeMe handle foreign words in English audio?

Transcribe foreign words as spoken in the audio. If you are uncertain of the spelling, use your best judgment and mark with [?]: "The concept of schadenfreude [?] applies here." Do not translate foreign words — transcribe what was said.