Audio transcription

Audio transcription software built for interview rooms

Upload an interview and get back a clean transcript with speakers separated and a timestamp on every line. Drop a 15-second clip in below to see the result, then read on for how the full project workflow fits together.

Try it now

A working sandbox. No sign-up, no project. Sample data only.

  • WAV, MP3, M4A, MP4, MOV, MKV and most other common formats accepted.
  • Speaker turns separated automatically; rename one speaker and the change applies project-wide.
  • Click any transcript line to seek the audio player; edits save instantly.
  • Your choice of transcription engine per file. Swap any time to suit the source language.

SAMPLE CLIP · INTERVIEW

INTERVIEWER · 0.0s

When did you first realize the bridge wouldn't hold?

How it works

Three steps from raw material to result.

STEP 01
Day 04 · Marin Interview.wav · 1h 12m · WAV · 94 MB
Engine: Deepgram Nova-3
Drop the file in

Audio or video, up to a few hours per file. The audio is extracted automatically and routed to the transcription engine you picked.
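
The service's own extraction pipeline is not described here, so the following is only a sketch of how audio extraction from a video container is commonly done with the ffmpeg command-line tool. The function name `build_extract_command`, the `extracted` output folder, and the mono 16 kHz target are assumptions chosen for illustration.

```python
from pathlib import Path


def build_extract_command(source: str, out_dir: str = "extracted") -> list:
    """Build an ffmpeg command that writes the audio track of `source` as a WAV file.

    Hypothetical helper: the output folder and the mono / 16 kHz settings are
    illustrative assumptions, not the service's documented behaviour.
    """
    src = Path(source)
    target = Path(out_dir) / (src.stem + ".wav")
    return [
        "ffmpeg",
        "-i", str(src),      # input file (audio or video container)
        "-vn",               # drop any video stream
        "-ac", "1",          # downmix to one channel
        "-ar", "16000",      # resample to 16 kHz
        target.as_posix(),   # output path; the .wav suffix selects the format
    ]


command = build_extract_command("Day 04 Marin Interview.mov")
```

The resulting list can be passed to `subprocess.run(command, check=True)` on a machine with ffmpeg installed.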

STEP 02
INTERVIEWER · 00:04:22

Take me back to that night. Did you have any idea what was going to happen?

ELENA M. · 00:04:31

It wasn't planned. None of it was. We were sitting there… and the lights went out.

Transcript lands aligned to audio

Speaker turns are detected and a voice fingerprint is captured for every voice in the file. Each line is clickable: clicking it seeks the waveform to the moment that line was spoken.
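
To make the link between a transcript line and the audio concrete, here is a minimal sketch of one way timestamped lines could be modelled. The `Line` class and both helper functions are hypothetical, not the product's data model. Clicking a line corresponds to seeking the player to `line.start`; `line_at` shows the reverse lookup, from a playback position to the line being spoken.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Line:
    speaker: str
    start: float  # seconds from the start of the audio
    text: str


def parse_timestamp(stamp: str) -> int:
    """Convert an 'HH:MM:SS' timestamp into whole seconds."""
    hours, minutes, seconds = (int(part) for part in stamp.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def line_at(lines: List[Line], position: float) -> Optional[Line]:
    """Return the line in progress at `position` seconds.

    Assumes `lines` is sorted by start time; returns None before the first line.
    """
    starts = [line.start for line in lines]
    index = bisect_right(starts, position) - 1
    return lines[index] if index >= 0 else None


lines = [
    Line("INTERVIEWER", parse_timestamp("00:04:22"), "Take me back to that night."),
    Line("ELENA M.", parse_timestamp("00:04:31"), "It wasn't planned."),
]
```

With the two sample lines above, a playback position of 265 seconds falls inside the interviewer's question, and 271 seconds lands exactly on the start of the reply.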

STEP 03
Rename speaker
EM
Applied across 14 files in this project
Day 04 · Marin Interview
Day 09 · Marin Followup
Day 17 · Ensemble
Edit, name speakers, share

Correct typos inline and name each speaker once; the rename propagates to every interview in the project where that voice appears.
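
One common way to get this "name once, applied everywhere" behaviour is to store the display name once per project and have each transcript line refer to a speaker ID instead of a name. The sketch below illustrates that design. The `Project` class, its fields, and the `spk_1` ID are hypothetical and are not taken from the product.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Project:
    # speaker ID -> display name, stored once for the whole project
    speaker_names: Dict[str, str] = field(default_factory=dict)
    # file name -> list of (speaker ID, text) transcript lines
    files: Dict[str, List[Tuple[str, str]]] = field(default_factory=dict)

    def rename(self, speaker_id: str, name: str) -> None:
        """Change the display name in one place; no transcript is rewritten."""
        self.speaker_names[speaker_id] = name

    def render(self, filename: str) -> List[str]:
        """Resolve speaker IDs to display names at read time."""
        return [
            f"{self.speaker_names.get(speaker_id, speaker_id)}: {text}"
            for speaker_id, text in self.files[filename]
        ]


project = Project()
project.files["Day 04 · Marin Interview"] = [("spk_1", "It wasn't planned.")]
project.files["Day 09 · Marin Followup"] = [("spk_1", "We went back the next morning.")]
project.rename("spk_1", "ELENA M.")
```

Because names are resolved when a transcript is rendered, a single `rename` call changes the label in both files without touching the stored lines.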

Frequently asked questions

What audio formats can I upload?

WAV, MP3, M4A, AAC, OGG, FLAC, and most common video containers (MP4, MOV, MKV, WebM). The audio is extracted server-side automatically before transcription runs.

How accurate is the transcript?

Accuracy depends on audio quality and accent. On clean studio interviews in major languages, word accuracy lands in the high 90s percent; on noisy field recordings with strong regional accents, it drops into the 80s. The source audio is always aligned to the text, so you can review and correct quickly.
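
For readers who want to check accuracy on their own material, word error rate (WER) is conventionally the number of word substitutions, deletions, and insertions needed to turn the transcript into a trusted reference, divided by the number of words in the reference. The sketch below is a plain edit-distance implementation of that definition. The lowercasing and whitespace splitting are simplifying assumptions, and this is not necessarily how the service measures its own figures.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance between the two texts, divided by reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # previous[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(
                previous[j] + 1,         # deletion: reference word missing
                current[j - 1] + 1,      # insertion: extra hypothesis word
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1] / len(ref)


reference = "we were sitting there and the lights went out"
hypothesis = "we were sitting there and the light went out"
wer = word_error_rate(reference, hypothesis)
```

Here one word out of nine differs, so `wer` is 1/9, or roughly 89% word accuracy on this short line. Accuracy in the high 90s corresponds to a WER of a few percent.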

Can I edit the transcript after it lands?

Yes. The transcript opens in a three-panel editor next to the audio waveform. Click any line to seek the player, edit text inline, rename speakers, or merge identities. Changes save instantly.

Does the transcript include speakers?

Speaker turns are detected automatically. Identities are voice-fingerprinted and matched across every interview in the same project, so once you name a speaker, the change applies everywhere they appear.
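
How the fingerprints are computed and compared is not specified here. A common approach to this kind of matching is to represent each voice as a numeric vector and compare vectors by cosine similarity, treating a score above some threshold as the same speaker. The sketch below assumes that approach; the three-number vectors, the 0.8 threshold, and the function names are all hypothetical.

```python
import math
from typing import Dict, List, Optional


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_speaker(
    fingerprint: List[float],
    known: Dict[str, List[float]],
    threshold: float = 0.8,  # assumed cut-off, chosen for illustration
) -> Optional[str]:
    """Return the best-matching known speaker, or None if nothing clears the threshold."""
    best_name, best_score = None, threshold
    for name, reference in known.items():
        score = cosine_similarity(fingerprint, reference)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


known = {
    "ELENA M.": [0.9, 0.1, 0.2],
    "INTERVIEWER": [0.1, 0.9, 0.3],
}
```

A new voice close to a stored fingerprint inherits that speaker's name, while a voice that matches nothing stays unnamed until someone labels it.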

How long can a file be?

Single uploads are capped at a few hours of audio per file in the free tier. Longer interviews are usually split into reels during recording; you can upload them as a folder and the project handles them as one session.

Put audio transcription to work on your project.

Start free with 5 minutes of AI transcription a month. Or book a personalised walkthrough with the team.