Design Thinking for Machine Learning

Or How to Build Products with a Giant Probability Calculator

How would the ability to predict the future affect your design thinking? If you knew exactly what your users were looking for, you would be able to build clean interfaces — giving your users only what they need, and only when they need it. The holy grail of design.

Machine learning allows you to predict this future, or almost. It takes in your question and gives you an answer, along with a probability of that answer being correct. For example, is John Doe thinking about going to Greece today? The answer: yes, 90%. So you take the gamble, and the next time John Doe comes to your website, you show him an awesome apartment in Greece. John is happy at how easily he found a great place to stay, and hopefully becomes a loyal customer. Extend that 90% probability across a large number of users, and you would guess where they want to travel correctly about 90% of the time.

Understand the implications of this probability

Being right 90% of the time also means being wrong 10% of the time. Furthermore, the more confident you require a machine learning model to be about its predictions, the fewer predictions it makes. To get a firm grasp on the concept, let’s consider how you, without a machine learning model, would go about guessing whether John Doe was indeed traveling to Greece. Let’s start with a benchmark: you want to be right 50% of the time. To make a prediction about John Doe, you toss a coin. If it’s heads, he wants to go to Greece. If it’s tails, he doesn’t. Given a large enough number of users, you will be right for about 50% of them.
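The coin-toss benchmark can be checked with a quick simulation. This is a minimal sketch: the user population, the 30% share of users who want Greece, and the function name are all assumptions made up for illustration.

```python
import random

def coin_toss_accuracy(wants_greece, seed=0):
    """Guess each user's intent with a fair coin and return the share of correct guesses."""
    rng = random.Random(seed)
    correct = 0
    for truth in wants_greece:
        guess = rng.random() < 0.5  # heads means "wants to go to Greece"
        if guess == truth:
            correct += 1
    return correct / len(wants_greece)

# Hypothetical population: 30% of 100,000 users actually want to go to Greece.
users = [i % 10 < 3 for i in range(100_000)]
accuracy = coin_toss_accuracy(users)
print(round(accuracy, 2))  # lands close to 0.5
```

A fair coin ends up right about half the time whatever share of users actually wants Greece, which is why 50% works as a floor to beat.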

But 50% is not good enough for you. You decide to show Greece to John Doe only if you’re at least 65% confident, so you start asking some serious questions to make your guess better.

Does John Doe like beaches? — Yes

How many times in a year does he take a vacation? — 3 times

How many times has he already travelled this year? — Twice

These answers inform your guess a little more, and now you are more confident that John Doe might actually be traveling to Greece. Let’s say you are now 70% confident about this trip. That clears your 65% bar, so John Doe will now get to see Greece on the website.

Now consider this scenario:

Does John Doe like beaches? — I don’t know

How many times in a year does he take a vacation? — 3 times

How many times has he already travelled this year? — Twice

Given that you don’t know whether John Doe enjoys beaches, you are now only 60% confident. That falls short of your 65% bar, so you decide that John Doe does not get to see Greece.

There is a profound implication to this tradeoff — the more confident your prediction, the fewer the predictions you make, mostly because you don’t have the range of data required for making a confident prediction for all your users.
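To see the effect in numbers, here is a minimal sketch of a confidence cutoff. The 0.70 and 0.60 scores are the two John Doe scenarios above; the other users, their scores, and the function name are hypothetical.

```python
# Confidence that each user wants to go to Greece.
# The two John Doe entries come from the scenarios above; the rest are made up.
confidences = {
    "john_beaches_known": 0.70,
    "john_beaches_unknown": 0.60,
    "asha": 0.92,
    "li": 0.55,
    "maria": 0.81,
}

def users_to_show_greece(confidences, threshold):
    """Return the users whose confidence meets or exceeds the threshold."""
    return [user for user, score in confidences.items() if score >= threshold]

for threshold in (0.50, 0.65, 0.90):
    shown = users_to_show_greece(confidences, threshold)
    print(threshold, len(shown), shown)
```

Raising the bar from 0.50 to 0.65 to 0.90 shrinks the audience from five users to three to one. The predictions that remain are the ones you trust most, but most users no longer get one.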

Precision and Recall

Data scientists refer to these two quantities as ‘precision’ and ‘recall’. Precision measures how often you are right when you do make a prediction, and recall measures how many of the cases you care about your predictions actually cover. So, as a rule, the higher the precision, the lower the recall. Both contribute to the overall accuracy of an ML model, and for any given trained model that accuracy is fixed. This means that we can graph the precision and recall tradeoff:

Precision and recall curves for two different algorithms. At a recall of 1, all data points receive a prediction, with 50% precision for algorithm 1 and 60% precision for algorithm 2.

As a rule of thumb, higher precision gives lower recall, and lower precision gives higher recall. Moving from a less accurate model to a more accurate one yields higher precision as well as higher recall. A data scientist’s job is to generate a model with the highest possible accuracy. As a designer, however, you cannot change this accuracy, and must work with precision and recall.
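The tradeoff can be made concrete with a small calculation. The sketch below uses the standard definitions: precision is true positives over all positive predictions, and recall is true positives over all actual positives. The scores and labels are invented, and treating precision as 1.0 when no prediction is made is a convention chosen here, not a rule.

```python
def precision_recall(scores, labels, threshold):
    """Compute precision and recall when predicting 'yes' for scores >= threshold."""
    predicted = [s >= threshold for s in scores]
    tp = sum(p and y for p, y in zip(predicted, labels))        # predicted yes, truly yes
    fp = sum(p and not y for p, y in zip(predicted, labels))    # predicted yes, truly no
    fn = sum((not p) and y for p, y in zip(predicted, labels))  # predicted no, truly yes
    precision = tp / (tp + fp) if tp + fp else 1.0  # convention when nothing is predicted
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical model scores and whether each user really went to Greece.
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.55, 0.40, 0.30]
labels = [True, True, True, False, True, False, False, True]

for threshold in (0.90, 0.65, 0.50, 0.0):
    p, r = precision_recall(scores, labels, threshold)
    print(f"threshold={threshold:.2f} precision={p:.2f} recall={r:.2f}")
```

With this made-up data, lowering the threshold moves precision from 1.00 down to about 0.63 while recall climbs from 0.40 to 1.00. The model itself never changes; only the point you pick on its curve does.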

The precision and recall you choose for your model have a profound impact on the kind of product you build. In our earlier example, if I set a precision of 99.9%, I would show only properties in Greece to every user the model predicts is likely to go to Greece. However, if I chose a 60% precision, I would merely highlight Greece as one possible option, while still allowing the user to navigate to other destinations. The more drastic the action taken on a model’s predictions, the more important it becomes to be mindful of this tradeoff. If I were predicting how likely John Doe was to cancel his booking to Greece, at 100% precision I would cancel his booking even before he asked me to. If I were 90% confident, I would notify him to confirm his travel plans.
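One way to encode the cancellation example is to map confidence bands to product behaviors. This is only a sketch: the two cutoffs mirror the 100% and 90% figures above, while the constant names, the function name, and the "do nothing" fallback are assumptions.

```python
# Cutoffs taken from the example above; in practice they would be tuned per product.
AUTO_CANCEL_AT = 1.0
ASK_TO_CONFIRM_AT = 0.90

def cancellation_action(confidence):
    """Pick a product behavior from the model's confidence that the user will cancel."""
    if confidence >= AUTO_CANCEL_AT:
        return "cancel the booking automatically"
    if confidence >= ASK_TO_CONFIRM_AT:
        return "notify the user to confirm travel plans"
    return "do nothing"  # assumed fallback when the model is not confident enough

for confidence in (1.0, 0.93, 0.60):
    print(confidence, "->", cancellation_action(confidence))
```

The more drastic the action, the higher its cutoff sits, so a wrong prediction at lower confidence costs the user a notification rather than a lost booking.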

Spam filters on your inbox are a great example of how the precision and recall tradeoff has been used for product design. Some emails are directly sent to your spam folder:

This happens when an ML model is very confident that the email is spam — a very high precision.

Some emails come with a note attached: ‘This message might be spam.’

This happens when the model is less confident, at a lower precision.

Even with these two different precisions, the recall is not 100%, and there will be cases where no prediction is made. The designer gives you an out in these cases — ‘Report Spam’.

Quick replies are another example of a precision-recall tradeoff. The model is aware that a response is needed, and also of the tone of the response. However, it is not precise enough to know exactly what reply is needed, and hence you can choose from three available options. This makes replying to messages easier. It also gives you an escape in case the prediction is wrong: simply ignore the reply suggestions and type your own!
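Offering several suggestions instead of one can be sketched as a top-k selection over scored candidates. The candidate replies, their scores, and the function name below are hypothetical, and this is not a description of how any real quick-reply feature is built.

```python
def suggest_replies(candidates, k=3):
    """Return the k candidate replies with the highest model scores, best first."""
    ranked = sorted(candidates.items(), key=lambda item: item[1], reverse=True)
    return [text for text, _score in ranked[:k]]

# Hypothetical candidate replies with the model's score for each.
candidates = {
    "Sounds good!": 0.41,
    "Sorry, I can't make it.": 0.08,
    "See you then.": 0.27,
    "Let me check.": 0.19,
    "Thanks!": 0.05,
}

print(suggest_replies(candidates))
```

No single candidate here is a safe bet on its own, since the best one scores only 0.41. Showing three raises the chance that one of them fits, and the text box stays available for everything else.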

Booking.com’s recommender systems make extensive use of the precision-recall tradeoff.

Recommendations based on my recent searches for India
Based on my interests, around where I live
Based on my travel history

One important thing to keep in mind here: AI is a tool for solving user problems, and should be treated exclusively as such. There is no ‘AI first’ product; products should always be ‘user first’.