Overcoming Writer’s Block with Automatic Transcription

If you’re a writer — of books, essays, scripts, blog posts, whatever — you’re familiar with the phenomenon: the blank screen, a looming deadline, and a sinking feeling in your gut that pairs poorly with the jug of coffee you drank earlier.

If you know that rumble all too well, this post is for you. Maybe it’ll help you get out of a rut; at the very least, it’s good for a few minutes of procrastination.

Here’s the core idea: thinking out loud is often less arduous than writing. And it’s now easier than ever to combine the two, thanks to recent advances in speech recognition technology.

Of course, dictation is nothing new, and plenty of writers have taken advantage of it. Carl Sagan’s voluminous output was facilitated by his process of speaking into an audio recorder, to be transcribed later by an assistant (you can listen to some of his dictations in the Library of Congress!). And software like Dragon NaturallySpeaking has offered automated transcription for people with the patience and budget to pursue it.

But it’s only in the last couple of years that automated transcription has reached a sweet spot of convenience, affordability, and accuracy that makes it practical to use more casually. And I’ve found it increasingly useful for generating a sort of proto-first draft: an alternative approach to the painful process of converting the nebulous wisps inside your head into something you can actually work with.

I call this process idea extraction (though these ideas may be more accurately dubbed brain droppings). Either way, they make for fertile creative soil, and I’m not the only one using transcription this way.

Part I: Extraction

Here’s how my process works. Borrow what works for you and forget the rest — and let me know how it goes!

  • Pick a voice recorder; I’m a fan of the Voice Memos app on my iPhone. Start talking. Try it with a topic you’ve been chewing on for weeks, or whenever an idea flits through your head. Don’t overthink it. Just start blabbing.
  • The goal is to tug on as many threads as you come across, and to follow them as far as they go. These threads may lead to meandering tangents, and you may discover new ideas along the way.
  • A lot of those new ideas will probably be embarrassingly bad. That’s fine. You’re already talking about the next thing! And unlike with text, your bad ideas aren’t staring you in the face.
  • Consider leaving comments to yourself as you go — e.g. “Maybe that’d work for the intro”. These will come in handy later.
  • For me, these recordings run anywhere from 20–80 minutes. Sometimes they’re much shorter, in quick succession. Whatever works.

Part II: Transcription

Once I’ve finished recording, it’s time to harness ⚡️The Power of Technology⚡️

A little background: over the last couple of years there’s been an explosion of tools related to automatic speech recognition (ASR) thanks to huge steps forward in the underlying technologies.

Here’s how ASR works: you import your audio into the software, and the software uses state-of-the-art machine learning to spit back a text transcript a few minutes later. That transcript won’t be perfect (the robots are currently in the ‘Write drunk’ phase of their careers). But for our purposes that’s fine: you just need it to be accurate enough that you can recognize your ideas.

My favorite tool for this is Descript, which makes it easy to ‘punch in’ to the audio and listen back to moments when the transcription is ambiguous. It also has solid organizational support, so I can create folders of related transcripts and their recordings, and its search feature lets me quickly run keyword queries across every transcript in a Project.

Under good conditions Descript delivers around 95% accuracy, powered by Google’s best-in-class speech recognition (and unlike the automated transcription of yore, there’s no need to train the software to your voice). And if you’re a subscriber ($10/mo) it costs just $4.20 to transcribe an hour of footage.
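
To get a feel for what a figure like 95% means, here’s a rough sketch in Python. It assumes “accuracy” means one minus the word error rate (a common convention for speech recognition, though not necessarily how Descript measures it), and the function name and sample sentences are made up for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edits needed to turn the reference words seen so far
    # into the first j hypothesis words (classic Levenshtein distance).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: what was said vs. what the transcript shows.
spoken = "maybe that would work for the intro"
transcribed = "maybe that wood work for the intro"

accuracy = 1 - word_error_rate(spoken, transcribed)
print(f"Accuracy: {accuracy:.0%}")
```

By this measure, 95% accuracy works out to roughly one wrong word in every twenty, which is plenty to recognize your own ideas.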

Importing audio into Descript (this isn’t sped up!)

(You’re reading this on the Descript blog, so I’m biased — but my excitement around this workflow was one of the reasons I joined the company!)
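
To put the $4.20-per-hour figure in terms of a single recording, here’s a back-of-the-envelope sketch. It assumes transcription is billed linearly by the minute, which the pricing above doesn’t spell out, and it leaves the $10 monthly subscription out of the math. The function name is just for illustration.

```python
HOURLY_RATE_USD = 4.20  # subscriber price per hour of audio, as quoted above


def transcription_cost(minutes: float, hourly_rate: float = HOURLY_RATE_USD) -> float:
    """Estimate the cost of transcribing a recording of the given length.

    Assumption: cost scales linearly with length (no rounding up to the
    next hour, no minimum charge). Actual billing may differ.
    """
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    return round(minutes * hourly_rate / 60, 2)


# Short, hour-long, and long brain dumps.
for minutes in (20, 60, 80):
    print(f"{minutes} min: ${transcription_cost(minutes):.2f}")
```

Under that assumption, a 20-minute ramble comes to about $1.40 and an 80-minute one to about $5.60.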

Some other cool Descript perks:

  • You can copy and paste text from multiple transcripts into a single composition, which is helpful for compiling ideas you’ve had around a given topic. Better yet, the app keeps your text synced with its underlying audio — so if you ever want to listen back to the original recording (perhaps the transcription missed a word), it’s one click away.
  • Markers and Highlighters make it easy to mark up your document. Don’t miss the ‘New Composition from Highlights’ feature, which consolidates all the text you’ve highlighted into a new Composition (I call it the ‘good parts version’).
  • Want to type some text inline with your transcript? Just switch to ‘Edit Audio Mode’ in the app and start typing.
  • There’s an audio recording feature built into the app (with one-click transcription), so you can keep riffing at your computer without having to pull out your recorder again.
  • Want to clean up your audio and use it for something else? Descript is a full-fledged audio editor, complete with automatic crossfades.

Once you have your text transcript, your next step is up to you: maybe you’re exporting your transcript as a Word doc and revising from there. Maybe you’re firing up your voice recorder again to dictate a more polished take. Maybe only a few words in your audio journey are worth keeping — but that’s fine too. It probably didn’t cost you much (and good news: the price for this tech will continue to fall in the years ahead).

A few more tips:

  • Use a recorder/app that you trust. Losing a recording is painful, and the anxiety of losing another can derail your most exciting creative moments (“I hope this recorder is working. Good, it is... @#*! where was I?”).
  • Audio quality matters when it comes to automatic transcription. If your recording has a lot of background noise or you’re speaking far away from the mic, the accuracy is going to drop. Consider using earbuds (better yet: AirPods) so you can worry less about where you’re holding the recorder.
  • Find a comfortable space. Eventually you may get used to having people overhear your musings, but it’s a lot easier to let your mind “go for a walk” when you’re comfortable in your environment.
  • Speaking of walking: why not go for a stroll? The pains of writing can have just as much to do with being stationary and hunched over. Walking gets your blood flowing — and your ideas too.
  • I have a lot of ideas, good and bad, while I’m thinking out loud and playing music at the same time (in my case, guitar, but I suspect it applies more broadly). There’s something about playing the same four-chord song on autopilot for the thousandth time that keeps my hands busy and leaves my mind free to wander.

The old ways of doing things — whether it’s with a keyboard or pen — still have their advantages. Putting words to a page can force a sort of linear thinking that is otherwise difficult to maintain. And when it comes to editing, it’s no contest: QWERTY or bust.

But for getting those first crucial paragraphs down (and maybe a few keystone ideas to build towards)? Consider talking to yourself. Even if you wind up with a transcript full of nothing but profanity — well, have you ever seen a transcript full of profanity? You could do a lot worse.

Want to try out Descript? Download it here, and you’ll get 30 minutes of free transcription.