Wonderful Wizard of Oz

Waterfield Tech Conversational AI
2 min read · Dec 19, 2022

Research by IBM found that the cost of fixing errors during development is 6x what it costs during design.

It’s 15x to fix them in testing, and 100x to fix them once the system is live. (You can see the original research here)

So how do you spot errors during design?

Design standards help.

Good documentation standards help.

Peer review helps.

But nothing beats testing with end-users.

The question is: how do you test something that hasn’t been built yet?

The wonderful Wizard of Oz has the answer!

In the classic movie (spoiler alert), the fearsome, larger-than-life Oz that appears on screen turns out to be an illusion, controlled from behind a curtain by an ordinary man: the Wizard of Oz himself. That’s where the name for Wizard of Oz testing comes from. It’s often abbreviated to “WOz” testing.

In a Wizard of Oz test, you invite end-customers to participate in a test of a new system. You give them some background on what they need to do and then let them interact with the system.

Except it’s not a real system. It’s just a design, plus a designer or WOz operator with tools that let them simulate the actual system.

With conversational AI, it’s actually much easier than with graphical user interfaces. The designers can specify the key dialog steps and what the system should say or do at each step based on different responses from the customer. We typically invite participants onto a Zoom call and use pre-recorded audio prompts or chat messages that we load into a prototyping tool.
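To make that concrete, here is a minimal sketch of what a WOz dialog script might look like. Everything in it is hypothetical: the step names, prompt file paths, and keyword routing are illustrative only, not any particular prototyping tool’s format. But it shows the idea: the designer specifies each step’s prompt and where different customer responses should lead, and the human wizard follows (or overrides) the suggested route.

```python
# Hypothetical WOz dialog script: each step names a pre-recorded prompt
# and maps keywords in the participant's reply to the next step. The
# human "wizard" plays the prompt for the current step, listens, and
# moves to whichever step next_step() suggests (or overrides it).

DIALOG = {
    "greeting": {
        "prompt": "prompts/greeting.wav",  # "Thanks for calling. How can I help?"
        "routes": {"appointment": "book", "hours": "hours"},
        "fallback": "clarify",
    },
    "book": {
        "prompt": "prompts/book.wav",      # "Sure. What day works for you?"
        "routes": {},
        "fallback": "clarify",
    },
    "hours": {
        "prompt": "prompts/hours.wav",     # "We're open 8am to 6pm, Monday to Friday."
        "routes": {},
        "fallback": "clarify",
    },
    "clarify": {
        "prompt": "prompts/clarify.wav",   # "Sorry, I didn't catch that. Could you say it again?"
        "routes": {"appointment": "book", "hours": "hours"},
        "fallback": "clarify",
    },
}

def next_step(current: str, participant_said: str) -> str:
    """Suggest the next dialog step based on simple keyword matching."""
    step = DIALOG[current]
    for keyword, target in step["routes"].items():
        if keyword in participant_said.lower():
            return target
    return step["fallback"]

# Example turn: a participant asking to book lands on the "book" step.
print(next_step("greeting", "I'd like to make an appointment"))  # book
```

In a real session, the wizard drives this by hand on the Zoom call, playing the audio prompt (or pasting the chat message) for whichever step comes next.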

What’s so great about this approach is that you can get extremely valuable feedback very early in the delivery process for a pretty low cost / effort / time commitment.

We use this approach at the end of Design Sprints to test the designs we co-create with customers, and when designing the experiences for our Virtual Agents as a Service offering.

Done right, it’s fast, efficient, and very cost-effective. You don’t need dozens of participants. User experience and usability testing pioneers at the Nielsen Norman Group recommend just 5 participants. Our experience backs this up. And iteration is key. Test with 5, iterate on your design, and test again if necessary.

But often, we get enough confidence in a design from one test. We recently ran a usability test with just 5 participants as part of a Design Sprint for a chain of healthcare clinics. When we surveyed the stakeholders on their confidence in the design, they gave a unanimous 10/10.

Now that’s what I call confidence.

Are you that confident about your IVR or chatbot experience? If not, get WOz testing.

And if you’re struggling to get from idea to implementation, consider doing a Design Sprint.
