Notes on Designing This American Life’s “Shortcut” Tool

The internet is rife with tools to clip, remix, and gif visual content: we make reaction gifs of our favorite TV shows, screenshot articles to highlight compelling ideas, and have gif buttons built into many of our social and work networks.

Sharing audio, however, is a tougher nut to crack. Delaney Simmons, in a Nieman Lab article about shareable audio, mentions how, in radio, there’s a “unique problem in that our content isn’t necessarily shareable.” At the core of shareable audio is a set of interconnected design challenges: how to allow listeners to skim podcasts, zero in on the stories and moments they want to share (and support multiple kinds of moments and sharing), create an eye-catching visual component that expresses but doesn’t overshadow the content, and — in the end — let the listener express their fandom the same way a gifmaker on Tumblr might.

In working on Shortcut, an audio sharing tool built in collaboration with This American Life, The Tow Center for Digital Journalism, and The Knight Foundation Prototype Fund, we’ve had the chance to learn from listeners, podcast creators, and prior experiments in shareable audio to develop a tool that answers some of these questions, and opens up more space for other tool builders to explore.

Our design thinking

Our vision for Shortcut — which has remained relatively steady since the start — was a tool that would allow users to access a podcast archive (currently, This American Life), quickly jump through an episode to find their favorite moments, convert those clips into beautiful, transcribed .mp4 videos, and share those videos on social media. (You can read more in Stephanie Foo’s article here.) The very first step in our design process was to take stock of the existing landscape, as a way to catalyze our thinking and start solving the right problems.

We broke the exploration up into several facets: other experiments in shareable audio, other experiments in shareable content in general, and examples of emergent fan culture across the web. So, in addition to audio-based sources of inspiration like Clammr, WNYC’s Audiograms, NPR’s experiments in viral audio, and Audible’s Clips, we also explored the larger design space, like “Why We Love Gifs,” Giphy, Vine, histories of gif culture, research on “emotional subtitles,” Blingee, and more.

Ultimately, we came down to a set of questions and priorities for the kind of tool we wanted to create, namely:

  • How can we reduce the friction of hearing a great podcast moment and sharing it? In one of our user interviews, our interviewee outlined their many-step process for clipping podcasts, which included taking a screenshot of the timecode, uploading the photo to iPhoto, getting the podcast on desktop, uploading the full podcast to a clipping service…
  • How can we support multiple kinds of listening and sharing experiences? Namely: what sorts of clips are people going to want to make? From the length of the clip to the visual design of the resulting audiogif, we knew that subtle choices would indicate that the tool was primarily for one kind of sharing. But what other forms of sharing could we support other than clipping a story wholesale? What if a user wanted to clip a super-short inside joke? A sound effect like a distinctive laugh?
  • How can we guide this tool away from being seen as a generator for promotional materials, and towards a tool for users to express their fandom? This is what many of our design questions ultimately boiled down to. We found ourselves continually drawn to the question of how the user could use our tool to not just share a portion of a story, but express their fandom of the show. After all, the creators of, say, a gifset on Tumblr aren’t necessarily trying to proselytize: they’re expressing themselves and connecting with other members of the fandom. In perhaps nerdier terms, we were less interested in the possibility of enabling user-created promotional materials, and more interested in creating the podcast version of Darmok and Jalad at Tanagra.
  • How can we do this for a variety of podcasts and listening environments? On top of all this, we knew we wanted the tool to “gracefully degrade” for a variety of situations: a tool that could take advantage of podcasts that had metadata like transcripts but also support those that didn’t, that could support customization but not require podcast developers to bake it in, and that could map well to mobile, for on-the-go clipping. Phew!

Our process

With a long list of design questions, we started rapidly iterating on wireframes. In order to support different types of use cases, we settled on three styles of browsing to play with:

  • Moving/seeking through the episode quickly, at a large scale
  • Moving/seeking through the episode through text, at a large scale, instead of by timecode. (This was a particular affordance of This American Life’s beautifully transcribed archive.)
  • An easy-to-use fine-tune clipping component, for small-scale changes.

You’ll see a lot of these elements repeated through the following wireframes, but shifted in subtle and important ways.

It’s worth pointing out that this design process was an extremely collaborative one. Every team member — regardless of their official role — provided invaluable ideas and insights along the way.

An initial wireframe. There’s an abundance of input options, from typing in a timecode to reading through a transcript to scrolling through a waveform. We had thought it might be useful to put the fine-tune waveform clipper at the top (as Clammr does), since folks would ultimately be using that functionality the most. However, in our case, we realized it threw off the visual hierarchy: going from timecode (broad) to fine-tune (focused) to scrubber (broad) to transcript (broad, but more focused than the scrubber) was confusing and unintuitive.

A Giphy-inspired version. Here, the user chooses their starting point, and — rather than clipping with brackets — simply selects a duration, making fine-tuning basically the only interaction. Since the transcript was down to the second, we also played around with the idea of letting you “snap to” the end of the sentence, as you might snap to a grid while arranging items in Photoshop. This didn’t include any transcript search, which is a feature we ended up really liking and prioritizing, so we scrapped this.
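The “snap to” idea can be sketched in a few lines. This is a minimal illustration in Python, not Shortcut’s actual code: the function name `snap_to_sentence_end`, the `tolerance` parameter, and the sample timestamps are all hypothetical, and it assumes the transcript can supply a sorted list of sentence end times in seconds.

```python
from bisect import bisect_left

def snap_to_sentence_end(clip_end, sentence_ends, tolerance=2.0):
    """Return the sentence end nearest to clip_end, or clip_end itself
    if no sentence ends within `tolerance` seconds.

    sentence_ends must be sorted ascending (seconds from episode start).
    All names and the default tolerance are illustrative assumptions.
    """
    if not sentence_ends:
        return clip_end
    i = bisect_left(sentence_ends, clip_end)
    # Candidates: the sentence end just before and just after clip_end.
    candidates = sentence_ends[max(i - 1, 0):i + 1]
    nearest = min(candidates, key=lambda t: abs(t - clip_end))
    return nearest if abs(nearest - clip_end) <= tolerance else clip_end

# Hypothetical sentence end times for a short stretch of transcript.
ends = [4.0, 9.0, 15.0]
snapped = snap_to_sentence_end(8.2, ends)     # close to 9.0, so it snaps
unsnapped = snap_to_sentence_end(12.0, ends)  # nothing within 2 seconds
```

The tolerance keeps the snap from yanking the clip boundary far from where the user put it, much as a design tool only snaps when an item is already near a grid line.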

Our first working prototype. You’ll notice that the hierarchy on the right goes from broadest to most focused (episode scrubber to fine-tune clipper). We’ve added a customization dropdown as well, letting users choose the animation type they want to use for their text. We included all the share options on this page, which we found felt somewhat cluttered.

A more final wireframe. You’ll notice that we’ve winnowed down the 6 (!) editing tools in the first wireframe to just three, and put them in the hierarchy laid out before (from broadest to most focused). Sharing options have been removed from this screen and moved to the preview page. We’ve also switched the side the video preview appears on. Customization has been streamlined from a dropdown (which takes up a not-insignificant amount of space, and won’t necessarily work well on smaller screens) to two arrows on either side of the video, almost Snapchat-like, which cycle through different styles (bouncing, scrolling, etc.).

A napkin sketch by Stephanie after our beta tests. You can see the mobile version shifting into a multi-step process: rather than trying to jam everything on just one or two screens, this gives all the elements room to breathe.

The final product! Jason, one of the developers of Shortcut, came up with the idea to let people click and drag within the transcript to highlight the segments they wanted (rather than having to select whole segments and then fiddle around with the fine-tune clipper). His change was a huge improvement, usability-wise. Eve Weinberg designed beautiful animations that helped showcase the text and give it life.

The final product on mobile. Note how it grew out of the prior napkin sketch. Typically, on mobile, users would have to long-press to select the text they want. Mobile operating systems don’t give web developers the access we’d need to take advantage of this feature (perhaps this is why sites like Medium don’t offer their text selection features on mobile). Fortunately, with Jason’s selection code, users don’t have to fiddle with it. They can just tap the first word and the last word to select all the text in between.
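The tap-to-select interaction boils down to turning two tapped words into a clip range. The following is a rough sketch under stated assumptions, not Jason’s selection code: the `Word` type, the `clip_from_taps` function, and the sample transcript are hypothetical, and it assumes each transcript word carries its own start and end time.

```python
from dataclasses import dataclass

@dataclass
class Word:
    text: str
    start: float  # seconds from episode start (assumed per-word timing)
    end: float

def clip_from_taps(words, tap_a, tap_b):
    """Select every word between two tapped word indices (inclusive)
    and return (clip_start, clip_end, clip_text)."""
    lo, hi = sorted((tap_a, tap_b))  # taps may arrive in either order
    if lo < 0 or hi >= len(words):
        raise IndexError("tapped word is outside the transcript")
    selected = words[lo:hi + 1]
    text = " ".join(w.text for w in selected)
    return selected[0].start, selected[-1].end, text

# Hypothetical four-word transcript.
transcript = [
    Word("Act", 0.0, 0.3),
    Word("one", 0.3, 0.6),
    Word("begins", 0.6, 1.1),
    Word("here", 1.1, 1.5),
]

# Tapping "here" first and "one" second still selects "one begins here".
clip = clip_from_taps(transcript, 3, 1)
```

Sorting the two taps means the user never has to think about which end they picked first, which suits a touch interface where a long-press selection is not available.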

Our next steps

The testing process for Shortcut has been a multifaceted one: even aside from QA testing, we need to make sure that the tool feels good, is intuitive for different types of users, works for a variety of types of podcasts, and — hopefully — supports many types of user expression. This is an ongoing process as we refine the tool for its open source release (email web @ if you’re interested!).

Happily, we’re not the only ones tackling this problem for podcasters: New York Public Radio’s Audiogram offers a self-hosted solution where users can upload any piece of audio, and Pop Up Archive is developing a Clipmaker for the archives.

We hope our research can inform these likeminded projects, and that Shortcut can offer a unique model for audio creators who want to make their archives more shareable.

Team Shortcut:

  • Stephanie Foo — Project Lead
  • Courtney Stanton — Project Manager
  • Darius Kazemi — Developer
  • Jason Sigal — Developer
  • Jane Friedhoff — UX Designer
  • Dalit Shalom — UI Designer
  • Eve Weinberg — Motion Graphics

Jane Friedhoff is a game designer, creative researcher, and experimental programmer whose work focuses on pushing the affordances of a given medium to create new, unusual, and playful relationships between people. She currently works at the Office For Creative Research, and before that was a creative technologist at the New York Times’ R&D Lab, where she developed journalism-oriented experiments like Madison and Membrane.