Survey Responses Gone Bad

Mike Zawitkowski
Acorn Analytics Blog
5 min read · Nov 12, 2016

--

A screenshot showing a glimpse of data in a real employee engagement survey.

Do you think all the responses in that survey of yours are equal in value?

Think again.

Check out the picture above. It's a mock-up based on real survey data that my company, Acorn Analytics, was hired to evaluate. I wasn't actually looking for comment data; I was scrolling to the last row of an Excel document of sample data, where a colleague had added some statistics we needed to review. But the "Nobody fuc.." caught my eye. Naturally, I decided to investigate.

In case it’s not clear, here’s the backstory. A company we’ll refer to as Acme sent out an annual engagement survey to its employees. Most of the survey collected numerical data, for instance Likert-scale items where a respondent rates a statement on a scale from strongly negative to strongly positive, or on a “scale of 1 to 10.” In this particular survey, however, employees were also given multiple opportunities throughout to provide “additional comments” in an open-ended format.

Reading through the rest of this person’s responses, it was apparent from both the open-ended comments and the numerical data that the answers were not very valuable. The pattern of responses suggested a troubling lack of thoughtfulness. Here are more comments from this same individual:

“That’s OK. Nobody fucking cares.”

“For question 7, you’ve meant reading online comic strips, right?”

“And that’s great, because they can concentrate on making the motherfucking code.”

“And I don’t fucking care.”

“I could use a raise.”

Based on these comments alone, how would you rate the value of the rest of this person’s responses? How representative do you think this person’s answers are of the other responses? Should a company include this individual’s data when using the engagement survey to make an important strategic decision?

I’m not claiming that this person’s comments are completely without value. If more people had responded in a similar way, the argument could be made that this person’s opinion is representative of a subculture in your organization.

If this individual had actually been providing answers we cared about, he (or she) could have dropped as many f-bombs as he (or she) liked and the response would have been included. What was clear was that this was simply a bad response.

Cursing and giving negative feedback aren’t what devalue a respondent. A bad response is one where a respondent takes a survey and provides data that at best is partially suspect, and at worst only serves to mask the truth from the organization that is trying to make things better.

Treating Surveys As Sacred

The data in engagement surveys is incredibly valuable. It is one of the few opportunities a company has to get the truth from a very important group of stakeholders. Your employees are on the front lines, so they know what’s really happening in the trenches. Your employees are also the ones responsible for executing whatever plans you come up with at the conference room table. You cannot run the company without their support. Your employees literally ARE your company.

At the same time, let’s acknowledge that not everyone in your company thinks surveys are important. Many, in fact, will think it is a waste of time, or at least time better spent writing code or selling to customers.

This is why, for every thousand responses you receive in a company-wide employee survey, at least one response is better omitted from the aggregate statistics in order to give you a more accurate picture. In most cases this person’s feedback will still be represented elsewhere, for instance when looking at issues within a particular department, or when highlighting a sentiment that is trending in certain parts of the organization. What we probably won’t do is include this person when we roll up the results at a higher level, using metrics like the overall mean score.
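To make that split concrete, here is a minimal sketch of how it might look in pandas. The file name, the column names (Likert items q1..q10, a “department” column), and the boolean “suspect” flag are all assumptions for illustration; the flag could come from whatever screening step you use (one possible screening sketch appears in the next section).

```python
import pandas as pd

# Assumed layout: one row per respondent, Likert items in columns q1..q10,
# a "department" column, and a boolean "suspect" flag from a screening step.
df = pd.read_csv("engagement_survey.csv")
likert_cols = [c for c in df.columns if c.startswith("q")]
suspect = df["suspect"].astype(bool)

# Company-wide roll-up: leave flagged respondents out of the overall mean.
company_mean = df.loc[~suspect, likert_cols].mean()

# Department-level view: keep everyone, so a flagged respondent's sentiment
# still surfaces where it may point to a local issue.
dept_means = df.groupby("department")[likert_cols].mean()

print(company_mean)
print(dept_means)
```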

Finding the Bad Apples

Now that we’ve established that bad responses exist, what do we do about it? How do we find them? Sometimes a bad response is easy to spot, as in this case: I happened to glance at an employee survey and a truncated fragment of a comment caught my eye. But you won’t have the time or the patience to read hundreds or thousands of comments individually, and you certainly don’t have time to analyze the Likert-scale responses by hand to figure out whom to reject. Fortunately, computers are great at this kind of thing. When you have a starting point like this individual from Acme, you can conceptualize a model and set the computers loose to round up some suspects for you to interrogate.
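As a rough sketch of what “rounding up suspects” could look like, here are two simple heuristics in pandas: flagging respondents who give the identical answer to every Likert item (straight-lining), and flagging comments that contain dismissive or hostile language. The column names and the heuristics themselves are my illustrations, not the model Acorn actually used for Acme.

```python
import pandas as pd

# Assumed columns: q1..q10 for Likert items, "comments" for free text,
# "respondent_id" for the respondent.
df = pd.read_csv("engagement_survey.csv")
likert_cols = [c for c in df.columns if c.startswith("q")]

# Heuristic 1: straight-lining. A respondent who gives the same answer to
# every Likert item has zero variance across those items.
straight_lined = df[likert_cols].std(axis=1) == 0

# Heuristic 2: dismissive or hostile language in the open-ended comments.
pattern = r"fucking|nobody cares|don't care"
hostile_comment = df["comments"].fillna("").str.lower().str.contains(pattern)

# Round up the suspects for a human to interrogate.
df["suspect"] = straight_lined | hostile_comment
print(df.loc[df["suspect"], ["respondent_id"] + likert_cols + ["comments"]])
```

Neither heuristic is a verdict on its own; they simply shortlist responses worth a closer look.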

There are a lot of ways to approach filtering out the noise, and you’ll probably use a combination of techniques. (If you’re interested in a review of them, let me know with a comment.) The point of this article is that you should do something to clean up your data. Don’t let the bad apples muddy your results.

Otherwise, ask yourself: if accuracy isn’t important, then why are you conducting a survey in the first place?

If you are in charge of a dataset like this employee engagement survey and you are not finding at least one person to reject out of every thousand, you are doing something wrong. On your next survey, my advice is to try to find at least one response to throw out. Explore statistically how this person could be skewing your data, and decide how to handle it for the entire analysis.
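One way to explore that skew, still assuming the hypothetical columns from the earlier sketches, is to compare the roll-up mean with and without the flagged rows:

```python
import pandas as pd

# Same assumed frame: Likert columns q1..q10 and a boolean "suspect" flag.
df = pd.read_csv("engagement_survey.csv")
likert_cols = [c for c in df.columns if c.startswith("q")]
suspect = df["suspect"].astype(bool)

mean_with = df[likert_cols].mean()
mean_without = df.loc[~suspect, likert_cols].mean()

# Per-question shift caused by including the flagged responses.
print((mean_with - mean_without).round(3))
```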

If you do this, your work will be rewarded. Not only will your dataset be cleaner, but you will understand the nuances of the data in a more sophisticated way. This makes you and the leaders of your organization that much more confident in how you use this data to make decisions.
