Rethinking audio editing on mobile
How the Anchor product team approaches traditionally complex audio functionality and makes it fun and easy to use.
Since we launched Anchor 2.0 earlier this year, one of the most common questions we’ve gotten from creators has been “Why can’t I edit audio before adding it to my station?” Editing has indeed always seemed like an obvious (and much-needed) feature to offer alongside all of our other functionality: recording, adding songs from Spotify and Apple Music, taking call-ins from listeners, remote interviewing, etc. So what took us so long?
Given our goal of democratizing audio creation and making tools available to anyone and everyone with a smartphone, we’ve always wanted creation on Anchor to be fast, frictionless, and so easy that anyone can use it. It’s critical that Anchor’s tools are immediately intuitive for everyone, and that they are designed specifically for mobile use.
These requirements made designing an elegant editing solution more difficult than we anticipated. Here was our first attempt, which looked a lot like a standard audio editing tool:
This approach was a non-starter for a few reasons:
- It’s technical. It requires creators to understand what a waveform is, and what the peaks and valleys represent. We’re trying to help everyone here, not just people with existing expertise in audio editing.
- There’s no context. Given Anchor’s current focus on spoken audio, a standard waveform carries almost no information relevant to creators about where and when things are said in the audio.
- It’s really hard to use on mobile. To edit with this interface, a creator would have to make precise adjustments on a very small canvas (a mobile phone screen). Not ideal.
This seemed like a good opportunity to rethink the editing model and design a new type of interface for interacting with and modifying audio content. So we thought. A lot.
Our ideas kept coming back to the same core questions:
- How do we represent audio in a way that everyone will understand, and that will actually help in the process of editing it?
- How do we make the actual editing process dead simple? Anything that involved dragging or tapping a very precise point within a waveform seemed ridiculously overcomplicated.
As we dug into this further, we realized we had an extra piece of information about the content that might just be the silver bullet we were looking for: word-by-word audio transcription.
When we introduced our Anchor Videos feature a few weeks back, we started automatically transcribing audio so our users could easily convert their segments into something digestible and shareable on social media. It occurred to us that this same transcription could be leveraged to design the simple, intuitive, mobile-first experience we had been looking for. After all, when you’re trimming most audio, what are you really doing? You’re deciding which words or phrases you want to include and which you want to exclude. So that’s what we did.
Starting today, you can edit call-ins and other people’s segments before adding them to your own station or podcast. It’s simple:
Before adding your audio, choose a starting word and an ending word, and you’re done.
We’ll discard the rest of the audio and just add the part you want your listeners to hear. It’s that easy (and it really does work great on mobile, even if you’re on the go).
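Under the hood, word-based trimming boils down to mapping the selected words back to timestamps in the audio. The post doesn’t describe Anchor’s actual implementation, but a minimal sketch of the idea might look like this, assuming the transcription service returns each word with its start and end time (the transcript data and `trim_bounds` helper here are hypothetical, for illustration only):

```python
# Hypothetical sketch: word-based audio trimming.
# Assumes a speech-to-text transcript represented as a list of
# (word, start_seconds, end_seconds) tuples.

from typing import List, Tuple

Word = Tuple[str, float, float]  # (text, start time, end time) in seconds

def trim_bounds(transcript: List[Word], start_idx: int, end_idx: int) -> Tuple[float, float]:
    """Return the audio time range covering the chosen words, inclusive.

    The creator picks a starting word and an ending word; we keep the
    audio from the start of the first word to the end of the last word.
    """
    if not (0 <= start_idx <= end_idx < len(transcript)):
        raise ValueError("invalid word selection")
    return transcript[start_idx][1], transcript[end_idx][2]

# Example transcript with made-up timestamps:
transcript = [
    ("Hey", 0.0, 0.4), ("everyone", 0.4, 1.0), ("welcome", 1.2, 1.7),
    ("to", 1.7, 1.8), ("the", 1.8, 1.9), ("show", 1.9, 2.5),
]

# Creator selects "welcome" through "show" — keep audio from 1.2s to 2.5s.
print(trim_bounds(transcript, 2, 5))  # (1.2, 2.5)
```

The returned time range would then be handed to an audio library to cut the underlying file, which is why the creator never has to touch a waveform at all.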
We’re thrilled with this first step towards making audio easier to edit on mobile, and we can’t wait to see what new kinds of creativity this feature unlocks.
And yes, we will be adding more editing features in the near future, and we’ll approach those tools with this same philosophy: audio creation on Anchor will always focus on intuitiveness and the actual content you care about — not a technical representation of the audio file that happens to contain it. We believe this approach is critical to moving audio forward as a medium, and we’re excited to continue opening the format up to the masses.