iOS, built for self-expression

The iPhone is a wonderful device for expression. It comes with multiple cameras, microphones, a touch-sensitive screen, and millions of applications that let you express yourself in entirely new ways.

Self-expression is something Apple has continually invested in. In iOS 9, the Notes app got a much-needed makeover. With the iPhone 6s, Apple announced Live Photos. And just last week, Apple quietly released Music Memos on the App Store.

We continue to get new and better ways to express ourselves. But what we really need are faster ways to express ourselves.

When it comes to jotting down an idea, nothing matters more than how quickly we can do it. This is what Don Norman calls the “Gulf of Execution”: the separation between a mental state (intention) and a physical state (action). Closing that gulf is the responsibility of good product design.

App developers are severely limited here because apps themselves sit somewhere on the home screen, which is gated by the lock screen. Just finding an app takes multiple swipes.

Only Apple controls the “full stack” of steps between intention and action, from hardware to operating system to application.

Surprisingly, Apple has not taken advantage of this power.

In fact, the last significant update happened all the way back in iOS 5, when Apple introduced a way to quickly access the iPhone’s camera from the lock screen.

This update speaks to the power of immediacy: it’s the only reason I still use the native Camera app.

As a thought experiment, let’s imagine how we could improve iOS to further facilitate rapid expression.


For starters, “designing for expression” can have a daunting ring to it. I’ve always found it helpful to remind myself there aren’t actually that many ways for us to express ourselves on the phone. We can count the primary forms on one hand:

  1. Drawing
  2. Writing
  3. Taking a photo
  4. Recording a video
  5. Recording audio

Importantly, each of these forms of expression has its own particular set of gestures.

For example, we usually start drawing by dragging our index finger along the screen, a motion known as a pan gesture.

In contrast, we type with a tap — or many taps — on the lower third of the screen. Usually with our thumbs.

When we take a photo, we hold the phone almost perpendicular to the floor and tap the circle at the bottom of the screen.

Same thing goes for capturing a video, except this time we tap and hold the circle.

Recording audio is the least common form of expression on the phone, but notice that the iPhone has two volume buttons. As a sort of callback to tape recorders, we could press both together to start recording.
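
To make this concrete, here’s a rough UIKit sketch of how a single full-screen view could tell these gestures apart and route each one to a form of expression. Everything here is hypothetical: the controller, method names like `beginDrawing(at:)`, and the idea that it could live beneath the lock screen are my own assumptions, not anything Apple ships. The gesture recognition itself, though, is ordinary UIKit:

```swift
import UIKit

// Hypothetical sketch, not an Apple API: one full-screen view whose
// gesture recognizers map directly onto the forms of expression above.
final class ExpressionViewController: UIViewController {

    override func viewDidLoad() {
        super.viewDidLoad()

        // Dragging a finger (a pan) starts a drawing stroke.
        let pan = UIPanGestureRecognizer(target: self, action: #selector(handlePan))

        // A tap starts text entry; further taps become keystrokes.
        let tap = UITapGestureRecognizer(target: self, action: #selector(handleTap))

        // Pressing and holding starts a video recording.
        let hold = UILongPressGestureRecognizer(target: self, action: #selector(handleHold))

        // Only treat a touch as a tap once it has clearly failed to become a pan.
        tap.require(toFail: pan)

        [pan, tap, hold].forEach(view.addGestureRecognizer)
    }

    @objc private func handlePan(_ gesture: UIPanGestureRecognizer) {
        switch gesture.state {
        case .began:   beginDrawing(at: gesture.location(in: view))
        case .changed: extendStroke(to: gesture.location(in: view))
        default:       break
        }
    }

    @objc private func handleTap(_ gesture: UITapGestureRecognizer) {
        beginTyping(at: gesture.location(in: view))
    }

    @objc private func handleHold(_ gesture: UILongPressGestureRecognizer) {
        if gesture.state == .began { startVideoRecording() }
    }

    // Placeholder hooks; a real build would open a canvas, an invisible
    // keyboard, or an AVCaptureSession here.
    private func beginDrawing(at point: CGPoint) {}
    private func extendStroke(to point: CGPoint) {}
    private func beginTyping(at point: CGPoint) {}
    private func startVideoRecording() {}
}
```

Apple could obviously do this far more elegantly at the OS level; the point is only that the gestures themselves are distinct enough to tell apart.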

Critically, a user’s gesture leaves no ambiguity as to which form of expression she intends.

This matters because it allows the system to start executing that intention immediately.

For example: a long press on the screen would start recording video.

Multiple taps on the screen would open a text editor, and those taps would be registered and output as keystrokes. (Invisible keyboards work!)

Panning a finger would open up a canvas and register the pan gesture as a stroke of paint.
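
And the parenthetical about invisible keyboards isn’t a joke. Because we know roughly where the keys of a QWERTY layout sit, each bare tap can be resolved to the nearest key center. Here’s a toy sketch of that idea; the layout numbers are made up purely for illustration, and a real version would calibrate against how the user actually types:

```swift
import CoreGraphics

// Toy sketch of an "invisible keyboard": taps on a blank area are
// resolved to the nearest key of a remembered QWERTY layout.
// The key positions below are rough guesses, purely for illustration.
struct InvisibleKeyboard {
    // Approximate key centers, normalized to a unit-square keyboard area.
    private let keyCenters: [Character: CGPoint] = {
        let rows: [(letters: String, y: CGFloat)] = [
            ("qwertyuiop", 0.17),
            ("asdfghjkl",  0.50),
            ("zxcvbnm",    0.83),
        ]
        var centers: [Character: CGPoint] = [:]
        for row in rows {
            let step = 1.0 / CGFloat(row.letters.count)
            for (index, letter) in row.letters.enumerated() {
                centers[letter] = CGPoint(x: step * (CGFloat(index) + 0.5), y: row.y)
            }
        }
        return centers
    }()

    // Resolve a tap (normalized to the keyboard area) to the nearest key.
    func key(for tap: CGPoint) -> Character? {
        func squaredDistance(to center: CGPoint) -> CGFloat {
            (center.x - tap.x) * (center.x - tap.x) + (center.y - tap.y) * (center.y - tap.y)
        }
        return keyCenters.min { squaredDistance(to: $0.value) < squaredDistance(to: $1.value) }?.key
    }
}

// A tap roughly in the middle of the layout resolves to "g".
let keyboard = InvisibleKeyboard()
if let key = keyboard.key(for: CGPoint(x: 0.5, y: 0.5)) {
    print(key) // g
}
```

A real implementation would lean on autocorrect to absorb the noise, the same way the visible keyboard already does.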

In short: there would be no steps between intention and action. Expression would be immediate, and the Gulf of Execution would be closed.