Designing with Machine Learning

Jonathan Remulla
Curiosity by Design
10 min read · Jan 21, 2020

Writing and sending your first survey can be an anxiety-filled experience. We tried to answer the question, “How might we be able to use our data to reduce the anxiety customers experience and increase confidence when creating a survey for the first time?” In this real-world case study, I’ll share our journey of designing with machine learning and the value it adds to the SurveyMonkey customer experience. We’ll touch on the end-to-end process, what it was like working with data scientists for the first time, how we validated solutions, and how we see the future of designing with machine learning at SurveyMonkey.

Janice started her job at a brand-new start-up three weeks ago. She’s the very first customer success specialist and wants to make a good impression on her new bosses. They haven’t done any research to understand what customers are saying about their product, so the founder suggests using SurveyMonkey to send a survey to their 5,000+ clients …tomorrow. “Sure!” says Janice, through the teeth of her contrived smile. The problem is that Janice has never created a survey before, let alone sent one to 5,000+ people. Janice thinks to herself, “Holy cow, how am I going to do this?”

Fear, anxiety, lack of confidence — all common emotions new customers of SurveyMonkey experience when writing their first survey. You can’t un-send a survey, so customers need a way to confidently create a professional survey that will return valuable data to inform better customer understanding or decision-making.

My product manager and I tried to answer three questions:

  • How might we leverage existing SurveyMonkey expertise and data?
  • How might we increase confidence and reduce anxiety?
  • How might we automatically create a professional survey for users?

SurveyMonkey has over 60 million registered users. Of those, 16 million are active users. We receive over 20 million responses per day. We definitely had the data we needed to begin designing features driven by Machine Learning.

What is the difference between AI and Machine Learning?

Artificial Intelligence, or AI, has been at the tip of our tech-tongues for a while now. There is still a lot of confusion around the difference between AI and Machine Learning. In some instances, you’ve probably heard them used interchangeably.

AI is the broader concept of machines being able to carry out tasks in a way that we would consider “smart.” Machine Learning is an application of AI based around the idea that we should really just be able to give machines access to data and let them learn for themselves (Source).

AI = The Concept. ML = The Application of AI. For SurveyMonkey’s purposes, we set out to use ML to provide predictions and recommendations.

The ‘Genius’ feature that started it all

In early 2017, we created a feature called ‘SurveyMonkey Genius’ which scored a customer’s survey and provided an estimate for the survey completion rate and time to complete. At the time, completion rate and time to complete were valuable insights we could confidently provide customers.

When we released the feature, it was a hit: it drove a massive 10% increase in survey deploy rate (a ‘deploy’ is a survey sent that returns at least 5 responses). We had helped make users feel more confident, but we knew we could do even more. Our next step was to build off this win and figure out a way to provide meaningful value throughout the whole survey-creation process.

Next up, question type prediction.

Over 30% of customers already have their questions typed up even before they start using SurveyMonkey for the first time. They have an idea of what they want to ask. But SurveyMonkey has about 20 different question types (Multiple Choice, Comment Box, Matrix, Matrix of Dropdown Menus, etc.) and a lot of the time, customers don’t know which to choose. Sometimes, they might choose a question type that’s not quite right for their use case, which can affect the quality of the answers they get back.

We set out to design a feature that could recommend the right question type for you.

SurveyMonkey is able to predict the right question type with 65% accuracy.

Our team for this feature consisted of a product manager, a data scientist for a couple of hours a week, and a front-end engineer. With over 16 million active users, we knew we had mountains of data about the question types they selected and the actual text customers typed into the question-text input field. The data scientist analyzed the content of the input text to develop a data model that could predict the right question type for customers.

A data model is what data scientists create by analyzing data; it’s the thing that makes a prediction or recommendation.

When we launched this in Q1 2018, we were able to predict the right question type with 65% accuracy. We reduced the time it took customers to write a survey and increased their confidence because we provided a prediction that SurveyMonkey could stand behind.
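The article doesn’t describe the model itself, but to make the idea concrete, here’s a minimal sketch of how a question-type classifier like this might be built, assuming a simple TF-IDF + logistic regression pipeline in scikit-learn. The training examples, labels, and pipeline choice are all illustrative assumptions, not SurveyMonkey’s actual implementation:

```python
# A minimal, hypothetical sketch of a question-type classifier:
# TF-IDF features over the question text feeding a logistic-regression
# model. This is an illustration, not SurveyMonkey's actual model.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical training pairs: the text a customer typed into the
# question field, and the question type they ultimately selected.
questions = [
    "How satisfied are you with our service?",
    "What could we do better?",
    "Which of these features do you use?",
    "Please rate the following aspects of your visit.",
]
question_types = ["Multiple Choice", "Comment Box", "Checkboxes", "Matrix"]

# Vectorize the text and fit one classifier across the ~20 question types.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(questions, question_types)

# Recommend a question type for a newly typed question.
print(model.predict(["How happy are you with our support team?"])[0])
```

At real scale, the training set would be millions of (question text, selected type) pairs rather than a handful, which is exactly the mountain of data described above.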

One step further: populating answer choices

It takes time for customers to type in a 5-point answer scale like ‘Very satisfied, Satisfied, Neither satisfied nor dissatisfied, Dissatisfied, and Very dissatisfied.’ Go ahead, time yourself. I clocked in at about 47 seconds, including correcting typos. Multiply that time by about 9 questions and you’re looking at 7 minutes.

Since we’d already predicted the right question type, how might we automatically fill out answer options? This was a no-brainer, but we had limited data: we had never tracked question text, question type, and answer text together, so we couldn’t correlate them. We needed to build a feature that would create a Scale Effect, collecting the data required to inform a data model that could make confident answer recommendations. This feature would also need to deliver customer value.

How we reached the scale of data we needed

To create this Scale Effect, the data scientist gathered the top answer-scale options, about 15 in all. In early Q2 2018, we launched a feature called ‘Answer Genius’ which let customers select an answer-scale from a simple drop-down list. Every time a customer selected an answer-scale from the drop-down, it gave us another piece of data. We were building our mountain of data — and it took until the end of Q2.

A Scale Effect means you need a large amount of data before data models can be considered confident in their predictions and recommendations (Source).

Once Q3 rolled around, the data scientist was able to build a data model that suggested the single answer-scale option most strongly correlated with the question text a customer entered. Shortly after, we launched an ML-powered answer-prediction feature.
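Conceptually, this model maps question text to one of the ~15 canonical scales and pre-fills the answer choices. A minimal sketch, assuming the same kind of text classifier as above; the scales, training pairs, and helper function are hypothetical, and only two scales are shown for brevity:

```python
# A minimal, hypothetical sketch of answer-scale recommendation: classify
# the question text into one of the canonical scales collected through
# 'Answer Genius', then pre-fill that scale's answer choices.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Two of the ~15 canonical answer-scales (abbreviated for brevity).
SCALES = {
    "satisfaction": ["Very satisfied", "Satisfied",
                     "Neither satisfied nor dissatisfied",
                     "Dissatisfied", "Very dissatisfied"],
    "agreement": ["Strongly agree", "Agree", "Neither agree nor disagree",
                  "Disagree", "Strongly disagree"],
}

# Hypothetical training pairs: question text -> the scale the customer
# picked from the 'Answer Genius' drop-down.
texts = ["How satisfied are you with your purchase?",
         "The checkout process was easy."]
labels = ["satisfaction", "agreement"]

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)

def suggest_answers(question_text):
    """Return pre-filled answer choices for the best-matching scale.
    In production you would likely threshold on model.predict_proba()
    and show nothing when the model isn't confident."""
    return SCALES[model.predict([question_text])[0]]

print(suggest_answers("How satisfied are you with our support?"))
```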

What if we could automatically build a survey for our customers?

SurveyMonkey has more than 100 expert-written templates for customers to use; new customers just don’t know which one is right for them. So at the end of 2018, we began working on an MVP that guides customers to the best possible survey. We called it ‘Build it for me’ mode. This would allow us to collect the data we needed. However, it was risky because we were significantly disrupting the survey-creation flow.

To build this feature, we had a content strategist and a survey researcher on our team. ‘Build it for me’ mode asks customers questions about their audience, survey goal, and use case. Based on those answers, we create the best possible template for them. We also built a ‘Genius Assistant’ panel that makes recommendations and guides customers through the rest of the survey-creation process. However, the ‘Genius Assistant’ was not built into the survey-sending process.

Screens from the first version of the ‘Build it for me’ mode.

The content strategist and survey researcher were key members of this effort. The feature contains 57 possible use cases and 52 possible recommendations, yet the user never needs to compare and choose between more than 6 items at a time. To keep the experience this simple, they created an intricate information architecture and content map that ensured all the recommendations shown to any given user made sense in sequence and were relevant to their audience and use case.
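The actual content map is far larger and proprietary, but conceptually it behaves like a shallow decision tree: each answer narrows the next set of choices to six or fewer. Here’s a toy sketch of that structure; every audience, goal, and template name below is hypothetical:

```python
# A toy sketch of the 'Build it for me' content map as a shallow decision
# tree: each step shows at most six options. All names are hypothetical.
CONTENT_MAP = {
    "Customers": {
        "Measure satisfaction": ["Customer Satisfaction Template",
                                 "Net Promoter Score Template"],
        "Get product feedback": ["Product Feedback Template"],
    },
    "Employees": {
        "Check engagement": ["Employee Engagement Template"],
        "Gather event feedback": ["Event Feedback Template"],
    },
}

def options_at(audience=None, goal=None):
    """Return the (at most six) choices to show at the current step."""
    if audience is None:
        return list(CONTENT_MAP)            # step 1: choose an audience
    if goal is None:
        return list(CONTENT_MAP[audience])  # step 2: choose a survey goal
    return CONTENT_MAP[audience][goal]      # step 3: recommended templates

print(options_at())                              # ['Customers', 'Employees']
print(options_at("Customers"))                   # goals for that audience
print(options_at("Customers", "Measure satisfaction"))
```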

We released the ‘Build it for me’ MVP in early 2019 to a small percentage of customers. As we waited for quantitative data on usage, we did qualitative research in parallel with new customers.

The results

Of the new customers that reached the selector screen, 37% selected ‘Build it for me’ while 16% selected ‘Build it by myself’. This proved that new customers did want guidance with their first survey and believed they would receive expert help from SurveyMonkey.

We knew we couldn’t account for every single use case, so we designed an ‘Other’ screen where customers could input the goal they were looking for. Of the customers that came to this screen, 66% submitted a goal. This gave us insight into which goals we could support in the future, as well as alternate names for use cases where customers were confused by the names we provided.

Through qualitative research, we found that customers needed guidance in the survey-sending process, too. Also, most customers overlooked the ‘Genius Assistant’ panel altogether on the survey-creation screen; its dark color was the primary reason.

‘Build it for me’ v2

We applied these learnings to the experience customers see today:

  • Improved the ‘Build it for me’ UI and made it a crisp white. We also improved naming and categorization based on usage data, and added a few more use cases based on user feedback submitted through the ‘Other’ screen.
  • Added ‘Genius’-recommended send methods based on the customer’s audience and use case.
  • Applied the ‘Genius Assistant’ panel to recommend collector settings (settings for sending surveys) based on the customer’s audience and use case.

Since we turned it on for 100% of new-customer traffic, it’s been a success:

  • 37% of new customers use ‘Build it for me’
  • 90% of customers complete ALL Genius Assistant recommendations after clicking on one
  • 95% of customers choose a recommended send-method

It’s coming up on a year, and we are still waiting for our Scale Effect before we even begin thinking up a data model for a truly ML-driven ‘Build it for me’ experience. There are so many more variables compared to question type prediction and ‘Answer Genius.’ This means we’ll need even more data.

Takeaways

At first I assumed that you had to be good at math to design with machine learning.

I’m terrible at math and initially felt I was going to be in over my head when I first met with the data scientist and team. Luckily, it wasn’t as scary as I thought. It was an iterative process, and you need to have a team of specialists around to take the lead during the different phases of design and engineering.

ML is still very young. I first thought we could build this British-accented virtual survey butler. The reality is that even very, very basic recommendations require a ton of data to pull off. On top of that, as your ML-driven features get more usage, you need to retrain your data models to provide even smarter, more accurate predictions and recommendations (I had no idea).

Nevertheless, it’s been a pretty cool 2 years designing with ML and I’m so happy to share some of my learnings.

Learnings

Your company needs to be resourced for this

  • We had a whole team of data scientists
  • SurveyMonkey has a ton of data constantly flowing in (20+ million responses PER DAY)

Start with what you know

  • We already had definitive data to inform our Data Models
  • We could confidently make our first predictions
  • Gave us confidence that we could, in fact, pull off ML in the product successfully

Have a vision

  • Do an exercise where you explore predictions and recommendations in-product
  • Make sure you’re adding value and solving customer needs
  • Plot out your roadmap of features

ML doesn’t always need to be seen

  • For productivity software, customers are trying to get something done
  • Sometimes it should happen in the blink of an eye
  • Sometimes you roll out the red carpet to show how cool your ML really is

It’s an incremental design process

  • Build your features strategically
  • ML takes time
  • We needed to build MVP features to gain the Scale Effect data we needed to develop true ML-driven features

Understand that it will take a true cross-functional team effort

  • Grew a lot more respect for Content Strategists — they’re innately great IA practitioners
  • Our model is only good if it actually follows a methodology — that’s where the survey researcher came in
  • Data Scientists are a lot like designers — they iterate on their models, massaging numbers instead of pixels in the UI
