Using Descript to Remove Filler Words from Audio Recordings

Jason Kincaid
Jun 7, 2018


Some handy tricks to make your audio sound as professional as possible

One of the best things about Descript is how easy it makes removing duplicate and filler words from a recording: just highlight the offending word in the transcript and hit Delete. Descript removes the corresponding audio and automatically crossfades the newly adjoining regions, so the edit sounds smooth.

But sometimes Descript won’t display an “Uh” or “Uhm” in the transcript, because it isn’t recognized as a word. In these cases, there are two ways to clean up your audio.

This quick video walks you through both, and each step is outlined in detail below.

Video Walkthrough

Correct the Transcript

With this approach you’ll insert the filler word into the transcript, so you can delete it just like you would any other word.

  • First make sure you’re in Correcting Text Mode. Then listen back to the recording and figure out where in the transcript the filler word should appear.
  • Once you’ve found the right spot, type in an approximation of the filler (e.g. “Uhm” or “Uhh”). You’ll notice that this text and a few nearby words in the transcript will briefly display a dashed underline — this indicates that Descript is resyncing the text to the audio.
  • Once the underline goes away a few seconds later, try playing back the audio for this region. You should see that the filler sounds have been mapped to the text you typed.
  • To delete the filler, switch over to Edit Audio Mode (you can use the keyboard shortcut Command + E). Select the filler word in your transcript (notice that the corresponding region is simultaneously selected in the waveform), and press Delete. The filler word will be removed and the newly adjoining regions will be crossfaded.
Adding an Um

Delete the Waveform

This approach is a little more advanced, but it can be quicker if you’re comfortable selecting regions directly in the waveform.

  • First, navigate to the region in your transcript where you hear the filler.
  • Look down at the waveform and try to spot where the offending utterance occurs. It’s helpful to use other nearby words as landmarks (you’ll see them at the top of the waveform, in the WordBar).
  • Once you’ve found it, select the region with your mouse, then press fn + Delete. This performs a Shuffle Delete, which deletes the region you selected and slides the subsequent audio over to fill the resulting gap.
Shuffle Deleting a region
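If it helps to picture what a Shuffle Delete does to the underlying audio, here is a minimal Python sketch. It is only an illustration of the idea, not Descript's actual implementation: the audio is modeled as a plain list of sample values, and the function name `shuffle_delete` and the sample numbers are made up for this example.

```python
def shuffle_delete(samples, start, end):
    """Remove samples[start:end] and slide the later audio left to close the gap.

    Hypothetical helper for illustration only; it models the recording
    as a simple list of sample values.
    """
    if not 0 <= start <= end <= len(samples):
        raise ValueError("selection must lie within the recording")
    # Everything before the selection, followed directly by everything after it.
    return samples[:start] + samples[end:]


# Made-up recording: the filler occupies indexes 3 and 4.
recording = [0.0, 0.1, 0.2, 0.9, 0.8, 0.3, 0.4]
edited = shuffle_delete(recording, 3, 5)
print(edited)  # [0.0, 0.1, 0.2, 0.3, 0.4]
```

The point to notice is that the result is shorter than the original: nothing is left behind as silence, because the audio after the selection moves up to meet the audio before it.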

Fine-Tuning Your Edits

Many of Descript’s magic moments come when you remove a word and the resulting edit is seamless (that’s why we built in automatic crossfades!). But sometimes an edit will benefit from finessing, and Descript gives you powerful tools to do it.

  • Adjust the timing. You can drag words in the WordBar to give them more space or to remove pauses. You can also drag the edge of any audio region to contract or extend it.
  • Tweak the crossfade. Descript lets you manually adjust the shape and duration of your crossfades. These can make a big difference, so don’t be afraid to experiment with them!
Adjusting Crossfades
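To get a feel for why the shape and duration of a crossfade matter, here is a rough Python sketch of the general technique. It is an assumption-laden illustration, not how Descript implements crossfades: audio is modeled as lists of samples, the names `fade_gains` and `crossfade_join` are hypothetical, and the two shapes shown (linear and equal-power) are common textbook curves rather than Descript's own options.

```python
import math


def fade_gains(n, shape="linear"):
    """Return n (fade_out, fade_in) gain pairs for a crossfade of n samples."""
    gains = []
    for i in range(n):
        t = (i + 1) / (n + 1)  # position in the fade, strictly between 0 and 1
        if shape == "linear":
            gains.append((1.0 - t, t))
        elif shape == "equal_power":
            # Cosine/sine curves keep the combined power constant.
            gains.append((math.cos(t * math.pi / 2), math.sin(t * math.pi / 2)))
        else:
            raise ValueError(f"unknown shape: {shape}")
    return gains


def crossfade_join(left, right, n, shape="linear"):
    """Join two regions by blending the last n samples of left into the first n of right."""
    if n < 0 or n > min(len(left), len(right)):
        raise ValueError("crossfade is longer than one of the regions")
    if n == 0:
        return left + right  # hard cut, no blending
    split = len(left) - n
    overlap = [
        l * g_out + r * g_in
        for l, r, (g_out, g_in) in zip(left[split:], right[:n], fade_gains(n, shape))
    ]
    return left[:split] + overlap + right[n:]


# Made-up regions on either side of an edit.
before = [0.5, 0.5, 0.5, 0.5]
after = [-0.5, -0.5, -0.5, -0.5]
smooth = crossfade_join(before, after, 2, shape="equal_power")
print(len(smooth))  # 6
```

A longer crossfade (a larger `n`) spreads the transition over more audio, while the shape controls how quickly one side gives way to the other. In this sketch, an `n` of 0 is simply a hard cut.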