Learning to Experiment: a Not-So-Scientific Method

Dan Lee
Axial Product and Design
Feb 9, 2017

And, by definition, “discovery” means you don’t know the answer when you start.

Ed Catmull, Pixar

One of the most valuable lessons I’ve learned as a product manager is the importance of testing and validating our ideas and our work. As much as startups and their products might fail because of bad execution, I would guess that they fail just as often - if not more often - because they took too long to test their ideas, learn from the results, and iterate.

Over the last year, experiments have become critical in enabling the product team at Axial to learn rapidly about the ideas on our roadmap and make decisions with greater confidence.

Along the way, we’ve also learned a ton about how to run a successful experiment in its own right. Because every company and every idea is so different, there is no perfect “recipe” for how to design an experiment, but the following key principles have helped us get better and better over time.

Identify and prioritize what you want to test

We’ve found it useful to think about organizing our experiments around key risks to the ideas and initiatives we are considering. These risks might fall into one of a few different buckets, which can be helpful to frame as a question you want to answer:

  • Value/impact risk: Is this idea actually worth it? How much will it actually impact our OKRs? What are the negative side effects, and how significant might they be?
  • Usability/execution risk: How easily can someone use the design we have in mind? How valid are the assumptions that underpin the product design? What are the key points of friction we’re worried about?
  • Effort/feasibility risk: Could we actually ship something in a reasonable timeframe? What technical hurdles or unknowns do we need to explore? What internal obstacles do we need to anticipate and clear?

At the outset of our last planning cycle, the Axial team brainstormed all of the risks we could think of about our upcoming product initiatives. We then ranked these based on the “size” of the risk (what are we most scared of not knowing three months from now?), and prioritized the ones we wanted to de-risk by the end of the upcoming quarter.

We held a time-boxed individual brainstorm, shared and grouped our ideas by theme, and then each dot-voted for the three risks we thought were biggest

If your team is aligned and thinking strategically about your roadmap, you’ll hopefully find that the risks on the board map pretty clearly to the company’s goals and product strategy (if not, something is askew).

We then created a living risks “dashboard” to make visible across the company what those risks were, our ideas for testing them, and the outcomes of those experiments:

Gray circles = unknown/untested, green = positive result, red = negative result

Each gray circle in the dashboard represents a risk we haven’t yet tested, while greens represent “positive” results (“users are opting into XYZ”) and reds represent “negative” results (“no one signaled an interest in ABC”).

While it’s tempting to see a lot of greens on the dashboard and think that equates to success, what’s most important is to have as few grays as possible (though being right more often than not is still a good thing!). Our goal as a team, accordingly, was to “cover” at least 80% of our risks and turn them either green or red. Remember that the goal of the experiments is to learn about and de-risk ideas - so it’s just as valuable to know something is not worth doing as it is to find positive signals.
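
If you want to track this in something more structured than a whiteboard, here is a minimal sketch of the same idea in Python. The statuses and the 80% coverage goal come from the description above; the field names and example entries are purely illustrative, not the actual dashboard we used.

from dataclasses import dataclass
from enum import Enum

class Status(Enum):
    UNTESTED = "gray"    # not yet covered by an experiment
    POSITIVE = "green"   # experiment produced a positive signal
    NEGATIVE = "red"     # experiment told us the idea isn't worth pursuing

@dataclass
class Risk:
    description: str
    bucket: str                        # "value", "usability", or "feasibility"
    status: Status = Status.UNTESTED

def coverage(risks: list[Risk]) -> float:
    """Share of risks that have been tested, regardless of outcome."""
    if not risks:
        return 0.0
    tested = sum(r.status is not Status.UNTESTED for r in risks)
    return tested / len(risks)

# Illustrative entries, echoing the examples above
board = [
    Risk("Users are opting into XYZ", "value", Status.POSITIVE),
    Risk("No one signaled an interest in ABC", "value", Status.NEGATIVE),
    Risk("The search-heavy industry UX is usable", "usability"),
]
print(f"Risk coverage: {coverage(board):.0%}")  # aim for at least 80% by quarter end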

Set goals that tell you if you’re right, wrong, or need to keep going

When it’s time to actually design and run an experiment, start by thinking about the outcomes you want to see - the ones that answer the questions that prompted the experiment in the first place. While it might seem obvious, it’s worth emphasizing the importance of defining the experiment’s hypotheses and what will tell you whether you’ve succeeded or failed. Otherwise you can lose sight of what you’re trying to answer, let distractions pull you in the wrong direction, and end up with meaningless results.

Depending on the nature of the experiment, the goal(s) might be quantitative, qualitative, or both. You might be looking for measurable signals that tell you how much users value a certain product idea (testing for value), or you might want to learn whether your product design and UX are intuitive and user-friendly (testing for usability).

A caution against overengineering quantitative goals: avoid the temptation to create “the perfect metric” or arrive at a target with exacting precision. You won’t know enough about what you’re testing to get it perfect anyway (hence the experiment). The goal should be just rigorous enough to give you directional confirmation that you’re right, wrong, or need to iterate and keep going. Experiments tend to be small, manually run, and limited in scope by design - that’s what enables a team to launch them, learn, and iterate on them quickly. Goal-setting should be approached much the same way.

Relatedly, you should also set some goals around the timeframe and the intended size (and personas, if relevant) of the experiment’s audience. Again, how you define these is rather subjective and depends on the nature of the experiment - but generally speaking, think of these two inputs as levers you can adjust to learn as quickly as possible.
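
To make that concrete, here is one way the key pieces could be written down - a rough sketch with illustrative field names and values, not Axial’s actual template:

from dataclasses import dataclass, field

@dataclass
class ExperimentPlan:
    # Illustrative fields only - not the actual Axial template
    risk: str               # the risk or question this experiment de-risks
    hypothesis: str         # what we expect to see if the idea is right
    success_signal: str     # directional threshold or qualitative signal for "right"
    failure_signal: str     # what would tell us we're wrong
    audience_size: int      # how many users to expose
    personas: list[str] = field(default_factory=list)
    timebox_days: int = 14  # how long before we force a decision

plan = ExperimentPlan(
    risk="No one signals interest in ABC",
    hypothesis="Targeted users opt into ABC when offered it directly",
    success_signal="A handful of opt-ins within the timebox",
    failure_signal="Zero opt-ins despite direct outreach",
    audience_size=20,
    timebox_days=10,
)

However you capture it, the point is the same: write down the signals and the time limit before you launch, so the results can actually settle the question.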

Axial uses a template to formalize the goals and design of our experiments. We’ve made a public version of this available for download.

Hack it, hack it, hack it

Running an experiment also requires being comfortable with hacking something together as quickly as possible. It’s just not practical to try to replicate the envisioned product and user experience with 100% fidelity - and doing so would defeat the purpose of the experiment anyway. While different experiments lend themselves to different tactics, every experiment should be hacked together and launched as cheaply as possible.

In his widely read essay Do Things That Don’t Scale, Paul Graham talks about the most “extreme” version of this concept - doing things manually (emphasis mine):

There’s a more extreme variant where you don’t just use your software, but are your software. When you only have a small number of users, you can sometimes get away with doing by hand things that you plan to automate later. This lets you launch faster, and when you do finally automate yourself out of the loop, you’ll know exactly what to build because you’ll have muscle memory from doing it yourself.

When it comes to usability tests, for example, design clickable prototypes when you can and develop a working prototype (only) when you must. When our design team began to redesign how our users classify the industries in which a business operates, we first prototyped the experience in InVision. We quickly grew concerned about its usability, and realized that the only way to test the search-heavy UX was for an engineer to build a lightweight, hacky prototype. Even then, we had to further scope the prototype down to a minimally functional experience that let us test what we wanted.

On the other end of the spectrum, there are times when it doesn’t even make sense to use design time for an experiment. Several months ago, we wanted to test whether users would be willing to update their profiles if we pre-populated them with data and asked them to confirm or correct what we had. We first tried using TypeForm to create an online form asking users questions about their profile, but felt that the experiment wouldn’t work without actually showing users what data we already had about them. We ended up switching to manually creating static mockups and Excel files with custom profile data for each of the twenty users we targeted.

In both of these examples, in addition to hacking together the quickest (but still useful) way to test our hypotheses, we ended up iterating on the design of the experiment itself as we learned what was and wasn’t the best approach.

Picasso’s process of iterating to a finished “product”

Experiment to learn, learn to experiment

If it isn’t clear by now, running experiments is a learning experience of its own. We’re constantly looking to sharpen our approach and thinking around testing our ideas - if you have any other stories or insights to share, please comment below!
