1. Configure before processing
Set the spoken language explicitly; auto-detection is less accurate, particularly for accented English and mixed-language recordings. Set the speaker count if known. Select the verbatim level (clean or full) that the client or platform requires. The wrong verbatim level is the most common reason transcripts fail QA.
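One way to enforce these settings is to validate them before any audio is submitted. The sketch below is illustrative only: TranscriptionConfig and its field names are hypothetical and do not correspond to any particular tool's API.

```python
from dataclasses import dataclass
from typing import Optional

# The two verbatim levels named in this step.
VERBATIM_LEVELS = {"clean", "full"}


@dataclass(frozen=True)
class TranscriptionConfig:
    """Hypothetical pre-processing settings; field names are assumptions."""
    language: str                        # explicit language code, e.g. "en"
    verbatim: str                        # "clean" or "full"
    speaker_count: Optional[int] = None  # None when the count is unknown

    def __post_init__(self):
        # Refuse to fall back on auto-detection.
        if not self.language or self.language.lower() == "auto":
            raise ValueError("set the spoken language explicitly")
        if self.verbatim not in VERBATIM_LEVELS:
            raise ValueError(f"verbatim must be one of {sorted(VERBATIM_LEVELS)}")
        if self.speaker_count is not None and self.speaker_count < 1:
            raise ValueError("speaker_count must be at least 1 when given")


config = TranscriptionConfig(language="en", verbatim="clean", speaker_count=2)
```

Failing at configuration time is cheaper than discovering the wrong verbatim level during QA.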
2. Process the recording and monitor for segment artifacts
For recordings over 30 minutes, check for chunking artifacts at segment boundaries: orphaned words at the end of one chunk and duplicated content at the start of the next. These occur when audio segmentation splits at an ambiguous speech boundary.
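The duplicated-content case can be flagged automatically. The following is a minimal sketch, assuming the transcript is available as a list of per-chunk text strings; the function names are hypothetical. It only detects words repeated across a boundary, so orphaned words still need a manual read, and a short overlap can be a legitimate repetition by the speaker.

```python
import re


def _words(text):
    # Lowercase and drop punctuation so "then." matches "Then".
    return re.findall(r"[a-z0-9']+", text.lower())


def boundary_overlap(prev_chunk, next_chunk, max_words=12):
    """Count words repeated at the end of prev_chunk and the start of next_chunk."""
    prev_words = _words(prev_chunk)
    next_words = _words(next_chunk)
    limit = min(max_words, len(prev_words), len(next_words))
    # Try the longest candidate overlap first.
    for n in range(limit, 0, -1):
        if prev_words[-n:] == next_words[:n]:
            return n
    return 0


def find_boundary_duplicates(chunks, min_words=2):
    """Return (boundary_index, overlap_length) for each suspicious boundary."""
    flagged = []
    for i in range(len(chunks) - 1):
        overlap = boundary_overlap(chunks[i], chunks[i + 1])
        if overlap >= min_words:
            flagged.append((i, overlap))
    return flagged


chunks = [
    "We should ship it on Friday. And then",
    "And then we review the results.",
    "That covers the plan.",
]
print(find_boundary_duplicates(chunks))
```

Each flagged boundary should be reviewed by a person before any text is deleted.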
3. Review and rename speaker labels
Replace "Speaker 1" / "Speaker 2" labels with actual names before running any formatting pass. Speaker label changes must propagate consistently from first occurrence to last — any inconsistency requires another find-and-replace pass later.
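A single-pass rename avoids the partial replacements that lead to a second find-and-replace pass. This is a sketch under the assumption that generic labels follow the "Speaker N" pattern shown above; the function names and example speaker names are hypothetical.

```python
import re


def rename_speakers(transcript, names):
    """Replace every generic label in one pass using a label -> name mapping."""
    if not names:
        return transcript
    alternatives = "|".join(re.escape(label) for label in names)
    # Word boundaries keep "Speaker 1" from matching inside "Speaker 10".
    pattern = re.compile(r"\b(" + alternatives + r")\b")
    return pattern.sub(lambda m: names[m.group(1)], transcript)


def remaining_generic_labels(transcript):
    """List any "Speaker N" labels still present after renaming."""
    return sorted(set(re.findall(r"\bSpeaker \d+\b", transcript)))


text = "Speaker 1: Hi.\nSpeaker 2: Hello.\nSpeaker 1: Bye."
renamed = rename_speakers(text, {"Speaker 1": "Dana", "Speaker 2": "Lee"})
print(renamed)
print(remaining_generic_labels(renamed))
```

Running remaining_generic_labels after the rename gives a quick check that no label was missed before the formatting pass begins.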
4. Apply style-guide formatting
If delivering to a client with a specific style guide (Rev, GoTranscript, TranscribeMe, or custom), apply timestamp formatting, paragraph length rules, and inaudible notation conventions at this stage — before exporting, not after.
5. Export in the required delivery format
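Use DOCX for client review and tracked-changes editing, PDF for locked delivery, TXT for plain-text integrations, SRT/VTT for caption workflows, and JSON for search indexing or CMS integration. Each format handles timestamps and structure differently; for example, SRT and VTT carry a timestamp range on every cue, while TXT carries timestamps only if they are written inline.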
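As an illustration of how timestamp behavior differs between caption formats, SRT writes milliseconds after a comma and WebVTT writes them after a period. The sketch below assumes segments are available as (start_seconds, end_seconds, text) tuples; that structure and the function names are hypothetical.

```python
def format_timestamp(seconds, fmt="srt"):
    """Format seconds as HH:MM:SS,mmm (SRT) or HH:MM:SS.mmm (VTT)."""
    if fmt not in ("srt", "vtt"):
        raise ValueError("fmt must be 'srt' or 'vtt'")
    if seconds < 0:
        raise ValueError("timestamps cannot be negative")
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    separator = "," if fmt == "srt" else "."
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{separator}{ms:03d}"


def to_srt(segments):
    """Render (start, end, text) tuples as numbered SRT cues."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        time_line = f"{format_timestamp(start)} --> {format_timestamp(end)}"
        blocks.append(f"{index}\n{time_line}\n{text}\n")
    # A blank line separates consecutive cues.
    return "\n".join(blocks)


print(to_srt([(0.0, 2.5, "Hello."), (2.5, 4.0, "Hi.")]))
```

A VTT export would additionally need the WEBVTT header line at the top of the file, which this sketch does not write.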