How we designed a great mobile app by obsessing about “time to insight” and in-field prototype testing

Julian Harris
Published in Knowcast
4 min read · Jul 4, 2023

600+ video clips from the field told us about the problem and who has it. But how do you reliably design and test a mobile app in the field, in a way that captures authentic behaviour?

“Price before product” isn’t useful yet: we’d established that we can’t get useful signals just from potential feature lists when the features are unfamiliar.

So we can’t just survey our way out of that situation. We need to test designs, and we need to test them where people will actually use the product.

We went with this design assumption: users need a visceral experience of a feature to be able to give feedback on it or rank it against other features.

Drawing from The Lean Startup, I was obsessed with making the key output of this stage of the project “prototype insights”, and the key performance indicator “time to insight”.

Specifically, I excluded production-quality engineering. I told Andre to “set the software engineering speed dial to max and the quality dial to min”: build prototypes quickly and optimise for the fastest insights. We were going to throw away a lot (if not all) of this code.

For our context of use (people on the go, running into forests and through poor-signal areas), this turned out to be a major problem.

How do you prototype quickly in this context?

Background: conventional UX design / testing can give you insight in a day

These days, it’s amazing how quickly you can build tested, highly user-friendly designs.

  • Design something specifically for user feedback: an “MVP”. This could take as little as half a day.
  • Get users to try it out and feed back. Again, with various user testing networks like Usability Hub, you could get feedback within hours.

So what about voice input? Can we draw from the (relatively recent) best practice in the voice assistant space?

I asked Adam Banks, a Xoogler mate who created all of Google’s usability testing labs. I used the London one a bunch; it’s awesome. He now runs a UX lab company and has experience with voice support. Talk about gold-standard advice! Unfortunately, UX labs, even for voice, are basically “get the user to sit down and start talking”. Even the mobile UX lab offering is more “sit down in a cafe” than “sit on a bicycle”.

So we had almost no infrastructure to help.

We were on our own.

The first functioning podcast prototype was helpful, but not enough.

While we were doing the diary study, I asked Andre to build some back-end capabilities to transcribe podcasts. Regardless of what we did with it and what the front-end experience was, we needed this core capability. It also meant we were able to build a front-end fairly quickly with some useful test data: podcasts people actually wanted to learn and take notes from, which was critical.
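To give a sense of scale, the core of such a transcription back end can be tiny. The sketch below is illustrative only: OpenAI’s open-source Whisper model is used as a stand-in engine (an assumption, not necessarily what we built), producing timestamped segments that a note-taking front-end can anchor notes to.

```python
# Illustrative sketch only: a minimal podcast-transcription back end.
# Whisper is a stand-in engine here, not necessarily our actual stack.
import whisper


def transcribe_episode(audio_path: str) -> list[dict]:
    """Transcribe one episode into timestamped segments for note-taking."""
    model = whisper.load_model("base")     # small model: fast enough for prototyping
    result = model.transcribe(audio_path)  # returns full text plus per-segment timings
    return [
        {"start": seg["start"], "end": seg["end"], "text": seg["text"].strip()}
        for seg in result["segments"]
    ]


if __name__ == "__main__":
    for seg in transcribe_episode("episode.mp3"):
        print(f"[{seg['start']:7.1f}s] {seg['text']}")
```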

The first key insight from the next prototype: screens really are useless for this job to be done. People are on the go and the screen is mostly unavailable.

I added some noise to the process by happily taking notes on my bike and tapping a screen. I rode my bike in a safe area where I could glance at the screen, and thought I was rather clever.

Unfortunately, testing showed categorically that other users couldn’t use their screens 95% of the time 🤦‍♂️. Sitting on public transport was the one case where it would work; in all the others, users would almost never use a screen:

  • Travelling
  • Chores
  • Exercise

Looking back at our notes, we could actually see these frustrations in the diary studies: it was impractical, dangerous, or downright illegal to use the screen while on the go.

Could we use our prototype with simulated voice input? Not easily.

We did an early experiment with one user (thanks Kris) where she went for a run in Golden Gate National Park (I’m in London, to be clear). We did this:

  • Give Kris the podcast prototype
  • Also ask Kris to dial into Google Meet with screen sharing, so we could capture what she said as she tried to capture her thoughts while running

The mix failed. There were just too many things going on. The connection would drop. We couldn’t hear properly. It was a mess.

Key insight: our app’s context of use is very technology-hostile. Reception is often poor and the user is moving around all the time. We found that even moving between wifi and cell networks introduced reliability issues (see the technical footnote in a future piece).

Hang on, couldn’t we just use screen-capture tools like UXCam?

No, they don’t capture voice. Voice design is not as mature (as of writing, July 2023), and voice capture is too niche for most of these products to support.

We needed to add voice input to our prototype.

It had become clear that we needed to build voice input into our prototype. This was becoming quite a sophisticated prototype indeed.

Read how we did this.
