Transcription accuracy: what WER measures and what it misses
5 min read · Updated May 27, 2026
Related reading
Speech-to-text providers compared: Whisper, Deepgram, AssemblyAI, Google STT
An honest comparison of the four speech-to-text providers most documentary editors actually use, scored on accent handling, diarization, latency, and cost per hour of audio.
How to transcribe an interview for documentary editing
A practical guide to transcribing interviews for documentary post: when AI is good enough, when human review is needed, how to handle speakers, timecode, and overlap.
Speaker diarization explained: how it works and where it fails
Diarization assigns speaker labels to audio segments without knowing who the speakers are. Here is how voice prints work, why similar vocal profiles cause problems, and how merge thresholds control the output.