Writing for Audio, Not Articles: Subtle Differences Make All the Difference

Jay Acunzo
Jul 2 · 12 min read

Originally posted to MarketingShowrunners.com — advancing the craft of marketers making original series to build passionate audiences. Join leaders from Red Bull, Mailchimp, LinkedIn, Salesforce, Shopify, Zendesk, and Wistia who get our monthly email.

As anyone who’s had to suffer through another person reading somebody’s bio out loud can attest, there’s a big difference between writing meant to be read silently and writing meant to be spoken aloud. Since 2013, when I launched Traction, a podcast for the venture capital firm I called home, NextView, I’ve been tripping forward, messing around, and considering the subtle but crucial differences between blogging and podcasting. To make your experience a little less painful and a whole lot quicker, here are the most important things to keep in mind when writing for a podcast…

Write to (A) be understood (B) in THIS medium.

In other words, forget your English degree … said the proud English Literature major, lower lip quivering ever so slightly.

Let’s start with this big, broad idea of “writing to be understood,” rather than writing to sound like writing, and then let’s press the idea through the lens of this site specifically to add “in THIS medium.”

When writing for audio, remember that you’ll need to SPEAK those words, while others need to HEAR them. Those are such crucial differences compared to writing for a blog post, social media, or email, and yet we overlook them for two reasons: First, nobody has “talker’s block,” and so we assume that we can just fire up a microphone and speak as we normally do. But that doesn’t honor the medium or the listener experience. They can’t see you talking, which matters a surprising amount. Second, when we write something for our podcasts, we don’t change our writing style to match the medium.

Here are a few ways to ensure we solve this problem:

1. Write and speak in short sentences. Always.

You’d be shocked at just how few compound sentences are easy to read out loud, especially when compared with how easily we read lengthier sentences on a screen (like this one right here). For the record, that last sentence would be terrible scripting for audio.

2. Constantly guide and tease the listener.

Audio lacks any visual indicators and helpers, like headline breaks or even just the words typed out for you to revisit (like you may have done right here). But despite the lack of visual indicators and helpers, audio is a visual medium. I’m not crazy, just go with me for a second: When you hear someone describe something, you start building that something in your mind. When you hear a scratchy, broken, male voice, creaking forward carefully, wavering with every sentence despite no real emotion in the voice, you might go, “Old man.” If you hear a lot of energy from a booming male voice, with words delivered confidently, perhaps with some jargon mixed in, you might think, “Middle-aged man, maybe an executive?”

The point is, every time you say anything or play anything on a podcast, the audience is creating an image in their minds. If the image creation process gets interrupted because the listener lacks details, or they receive details that are conflicting or confusing, you’ve lost them. Aside from boredom, confusion kills podcasts quicker than you can say, “Subscribe on Apple Podcasts, Spotify, Stitcher, SoundCloud, Google Podcasts, Overcast, Pocketcasts, or wherever you get your podcasts.” (Okay, so you can lose listeners way, way faster than you can say that…)

Never allow someone listening to start thinking, “Wait, what just happened? Who’s that? What did that word mean? Where are we now?”

One way to do this is to routinely deliver “signposts” to listeners. Signposts are overt transitions between two moments. In text, you can merely cut to the next thing. (This post does that, in fact.) In audio, you need to both wrap up the stuff that came before and project forward to what happens next.

Signposts can appear anywhere from big, chapter-like transitions to more subtle transitions between quotes or interview questions (meaning that, yes, you can deploy signposts on the fly as an interviewer without scripting later).

An example of a signpost between chapters or show segments might look like this:

  • “So that answers our first couple questions: What does it take to make an enjoyable podcast, and what does Jenny do specifically to make hers so darn entertaining? Next, let’s explore how to attract your first few listeners.”

In this case, we’re closing one chapter and opening another, and it feels pretty obvious to the listener. (In writing for text, you’re told “show, don’t tell.” In audio, there’s a lot of telling.) Given the signpost I just shared, listeners now know what to expect next, in part because we gave them a beat to “conclude” the stuff that came before, and in part because we overtly hinted at what comes next to renew their focused listening attention.

Signposting can be useful in smaller, less segmented moments too. Here’s an example of using one as a sort of subtler transition that still serves the same purpose of advancing the listener forward in time with clarity, not confusion:

  • [QUOTE-1 plays]
  • [VOICEOVER] “And that’s what Shane thought for years, until one day, a guy walked in the door.”
  • [QUOTE-2 plays]

In this case, we’re carrying forward some kind of story about Shane. He had an assumption based on what he just shared with us in the first quote. The signpost transitions us from one moment to the next in a few hidden but brilliant ways. First, it helps us absorb and understand the first quote better. In just a few words from the host, it cements in our minds that something was his assumption or existing belief: “And that’s what Shane thought for years.” Then, it calls our attention to a specific detail that is worthy of our renewed interest in the episode: “Until” (Ohhhh, until what?!) “one day” (Ahhh, which day?! Tell me!) “a guy walked in the door.” (Who?! Who was the guy?! And what did he say to Shane to make him change his thinking?)

In this case, the signpost advances the action from one moment to another, and it grabs us by the back of the neck to point us to something crucial. It’s saying, THIS! This is what you need to hear next. Don’t miss it! Stop zoning out. Keep going!

Now, in that particular example, the signpost was delivered by voiceover — narration written into the episode and performed in post-production. But you can also use signposts inside an interview, in the moment, so long as you think like an editor while interviewing. Here’s an example:

  • [HOST, after the subject stops talking]: So up until this point, we’ve been talking mostly about shows-as-podcasts. But shows refer to a lot more than just audio. I’d love to spend a few minutes discussing video next. You all create a ton of longer-form videos. Why?

See? Easy. Crucial! But easy. Signposts are magical.

3. Introducing subjects.

Another way to write for audio specifically, as opposed to text, is to more overtly introduce new people (or, really, any new concept at all) by giving them their own, contained moments. In text, you might introduce someone or something “in-flow,” rather than break out of the main arc or narrative. You can introduce them within the larger points you’re making. Why? The audience can literally see their name and any other detail about them. They can revisit those words instantly, too, if they trip up.

That’s not the case in audio. Introducing someone in-flow can confuse the listener by packing in too many details too quickly.

Think of it this way: With an article, you’re putting an entire plate of food down in front of the audience and letting them tuck into that meal on their own time. With a podcast, you’re putting down the plate of food, then plopping down in the chair next to them to spoon feed them every single bite, one tiny bite at a time. Did they swallow? Give them a moment. Maybe add a signpost to be sure. Okay, it’s down. Great! Next bite…

HERE COMES THE AIRPLAAAANE…

(Sorry, I have a seven-month-old at home. Moving on…)

Here’s how I might introduce a person named Rita Smith, a marketing agency owner, in a blog post:

“[Paragraph describing the issues of today’s marketing industry.] After seeing plenty of pretty bad marketing in her career, Rita Smith decided to change how brand clients approach social media. To bring it back to basics, she started an agency called Marketing Matters. As she told me, ‘Most bad marketing overlooks the most basic of principles. When I started the agency, I decided to focus on those human elements, those first principles, and never talk about all those trends or conventional ideas.’”

No big deal, right? When you just read that, it was totally fine. But if you closed your eyes and listened to a full paragraph about our industry, THEN heard those last few sentences delivered “in the flow” of that thought, I’d lose you. Audio is, again, a visual medium. When you hear a podcast, your brain constructs the visuals and the meaning behind the words. If I drop something unexpected into that, you’re thrown off course. You stop paying attention to the content and start focusing on all the questions in your mind. Making matters worse, the podcast keeps playing, and so you fall even further behind.

That’s why the subtle details that are included or omitted can make all the difference to the listener. That’s why my seemingly simple way to introduce Rita Smith written above is actually terrible for audio.

As a result of this reality, and thanks to the need to introduce a new subject in an isolated moment, we get the following cliché from NPR:

[Paragraph describing the issues of today’s marketing industry.]

[RITA 3:21–3:22] “Most bad marketing overlooks the most basic of principles.”

[VO] “That’s Rita Smith. She’s the founder of an agency called Marketing Matters. After seeing plenty of pretty bad marketing in her career, she decided to change how brand clients approach social media.”

[RITA 4:11–4:13] “When I started the agency, I decided to focus on those human elements, those first principles, and never talk about all those trends or conventional ideas.”

That’s the rote approach in audio, especially among those trained in public radio tropes: short quote + clear, quick introduction of the speaker (“THAT’S So-and-So”) + longer quote. (And of course, cue the xylophone music next…)

There are other ways that don’t follow this template, too. I might simply rewrite Rita’s intro like this to shift it from text-friendly to audio-friendly (shown line-by-line, with each original line followed by its rewrite):

[Paragraph describing the issues of today’s marketing industry.]

Original: “After seeing plenty of pretty bad marketing in her career, Rita Smith decided to change how brand clients approach social media.”
Rewrite: “Rita Smith has seen some pretty bad marketing in her career. As a result, she decided to change how brand clients approach social media.”

Original: “To bring it back to basics, she started an agency called Marketing Matters.”
Rewrite: “Rita is the founder of an agency called Marketing Matters, which has a very specific belief system.” (Editor’s Note: This clause signposts the prior quote by using Rita’s name again, and signposts the next quote by hinting at “a very specific belief system.”)

“Most bad marketing overlooks the most basic of principles. When I started the agency, I decided to focus on those human elements, those first principles, and never talk about all those trends or conventional ideas.”

Getting great sound bites is like, um, uh, well … hardCOUGH! Excuse me: It’s hard.

Unlike when writing an article, a quote not only needs to contain the right words, it needs to SOUND a certain way. When writing for text, it’s totally fine if somebody keeps up-speaking or cluttering their answers with “verbal debris.” Readers don’t hear the up-speak, and you can just write a quote cleanly, without the junk around it. But, like, yanno, um, if people in an audio interview, uh–COUGH–keep talking with that up-speak at the END? And they never come down to end on a punchy NOTE? And they say valuable things but it sounds, like, TERRIBLE? Well, now you’re DOOMED?

(PS: If you’ve ever edited a podcast before, you know how terrible the subtitle of this section would be to fix. If you want to use that line, sure, you can splice together “Getting great sound bites is” with that final “hard.” But then you’re gonna need to zoom way, wayyyyy in to the waveform to try and de-couple the “s” sound from “it’s,” which currently bleeds into the “h” sound from “hard.” If only they’d just said “Excuse me: hard,” instead of “Excuse me: It’s hard.” The devil is in the details, and he’s playing my emotions for a fiddle.)

So, if getting great sound bites is so darn hard, then we need to become masters of finding moments. We can build rapport ahead of time to relax people, for instance, and we can treat our interviews more like we’re dancing with someone else than marching them forward (since, as we’ve just experienced rather painfully, they’ll often fall behind or trip sideways). In other words, great interview questions are vital in order to extract great content that both STATES the best stuff AND sounds the best.

I’ve written more in-depth about the three best questions to ask (and in what order) to extract great content, and I followed that up with 9 questions that the greats like Terry Gross, Ira Glass, Howard Stern, Bill Simmons, and Kara Swisher use. I’d suggest reviewing those to really improve on this front, but for now, here’s an assortment of questions to try:

  • Tell me about…
  • How did you feel when… (or, how did that feel?)
  • Can you give me an example?
  • What changed when X happened?
  • What did you think it would be like, and what was it really like?
  • What do you say to people who…?
  • (Superlatives) Best, worst, funniest, scariest, hardest, least certain, favorite…?

Then, when ordering your episode (whether in post production or on the fly during the interview), try this flow:

  • First, establish some stakes right away (big questions, conflict, why someone would care, emotionally gripping moments left open-ended until later). Deliver those open loops immediately. You don’t summarize the whole thing, for instance. You leave something unanswered that listeners can’t wait to close the loop on … later in the episode.
  • Second, build trust between the audience and the subject(s) early in the episode. (You want listeners to receive early moments of laughter and warmth from the guest, or some brilliant ideas, or pithy insights, or irresistible short stories. These should be delivered before you share a lengthy bio, before you dive into their full backstory, because that trust and interest must be built before listeners will commit to all 15, 20, 45, or 60 minutes of your episode. The best way to earn trust is to deliver the best stuff right up top. Then, yanno, deliver even more “best stuff” later. Writing a narrative-style episode can of course put that in your hands more squarely than an interview, but even those less predictable interview-based episodes can deliver irresistible content up front, not table-setting stuff. Launch into pressing questions or key stories early on, and use the moments before you hit Record to build rapport instead. Don’t waste the audience’s time to “ease into it.”)
  • Third and finally, go deeper. Through follow-up questions and conversation or through your narration and the rearranging of quotes, you can share more of the “arc” now because listeners are now confident that this person or story is worth their time investment.

Scripting is never optional. SOMETHING must be written.

What plagues so many poor podcasts is a lack of intentionality with writing. Narrative-style shows are written as if to be published as articles, while “interview show” has become an excuse NOT to do all the voiceover and rearranging of sound bites. But audio isn’t text, and even interview shows still require some writing. It could be as simple as writing out great questions and a loose structure for the final episode.

Additionally, all shows regardless of format benefit from writing cold opens to better entice the listener to keep going. Pull quotes (those punchy quotes ripped from an interview and placed at the beginning, sans-context) are both commodified and less likely to truly reel in the listener, given that lack of surrounding context or addicting feeling that happens when a listener is in the flow of consuming something. The exception might be when you’re talking to someone with an unbelievable, edge-of-your-seat story, but how often does that really happen for us?

Writing also helps create better endings, especially when you consider that endings are where we make a higher-friction request of the audience, the most important being: subscribe via email. Instead, most endings merely drop people off a cliff. They kill momentum, shrug and say goodbye, or stuff the listener’s brain with too many calls-to-action. Why not hit them with, say, a Goosebumps Walkaway? That’s where writing your outro (and ensuring it’s not copy/pasted from past episodes) can help improve both the listener experience and the marketing results.

When creating a podcast, respect the medium. Honor the psychology of the listener. Whether you’re preparing research docs on guests, scripting short moments before, during, or after an interview, or you’re shaping an entire episode, as a showrunner, you’ll inevitably find yourself writing for audio. Just make sure to embrace what makes that task different than writing something else.

The subtle differences make all the difference in the world.



Written by Jay Acunzo

founder, MarketingShowrunners.com | author, Break the Wheel | host, Unthinkable podcast & other shows about creativity | keynote speaker 20x/yr