What WER is good enough to cut from?

A WER of roughly 7 to 8 percent, which is around 92 to 93 percent real-world accuracy on the specific material. At error rates higher than that, the time saved by transcription starts to disappear as the editor spends it on corrections.
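To check where a transcript sits relative to that threshold, WER can be computed as the word-level edit distance (substitutions, deletions, insertions) divided by the number of words in a hand-corrected reference. The sketch below is a minimal illustration: the function name `wer` is hypothetical, and the normalization (lowercasing and splitting on whitespace) is a simplifying assumption. Real evaluations usually also normalize punctuation and numerals before scoring.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / reference word count."""
    # Assumed normalization: lowercase and split on whitespace only.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions to reach an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    score = wer("the quick brown fox", "the quick brown box")
    print(f"WER: {score:.2%}, word accuracy: {1 - score:.2%}")
```

Because insertions count as errors, WER can exceed 100 percent, so "accuracy" taken as 1 minus WER is only an approximation of the share of words that came out right.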

Why do two providers disagree on the same clip?

Different training datasets, different architectures, different preprocessing. Where two engines disagree on a word, the audio is probably at the edge of what either can reliably decode. A human review is worth running on those segments.
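One way to find those segments is to align the two transcripts word by word and list the spans where they differ. The sketch below uses `difflib.SequenceMatcher` from the Python standard library; the function name `disagreements` and the whitespace-based tokenization are assumptions for illustration, not a method any provider prescribes.

```python
from difflib import SequenceMatcher


def disagreements(transcript_a: str, transcript_b: str) -> list[tuple[str, str]]:
    """Return (text_from_a, text_from_b) pairs where the transcripts differ."""
    # Assumed normalization: lowercase and split on whitespace only.
    words_a = transcript_a.lower().split()
    words_b = transcript_b.lower().split()

    # autojunk=False keeps frequent words from being ignored on long inputs.
    matcher = SequenceMatcher(a=words_a, b=words_b, autojunk=False)

    spans = []
    for tag, a_start, a_end, b_start, b_end in matcher.get_opcodes():
        if tag != "equal":
            spans.append(
                (" ".join(words_a[a_start:a_end]), " ".join(words_b[b_start:b_end]))
            )
    return spans


if __name__ == "__main__":
    for a_text, b_text in disagreements(
        "we met doctor nguyen in hanoi",
        "we met doctor win in hanoi",
    ):
        print(f"engine A: {a_text!r}  engine B: {b_text!r}")
```

An empty string on one side of a pair means one engine produced words the other did not. If the transcripts carry word-level timestamps, the flagged spans can be mapped back to the audio for review.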

Does providing a word list help accuracy?

Yes, for proper nouns and technical terms. Most providers accept a custom vocabulary list that weights those terms higher during decoding. This can substantially reduce substitution errors on the high-value words.

Transcription accuracy: what WER measures and what it misses

5 min read · Updated May 27, 2026
