The two statistical tools that are worth paying more attention to

Data analysis is core to almost any role in the knowledge economy. We either do the analysis ourselves, on data that we collected, or try to learn from analysis done by others.

In both cases we are looking for insights: meaningful patterns in the data that have some functional use, such as supporting a business decision.

Many of us are able to tell a median from a mean, and share a similar image of a semi-shaded bell curve in our heads when someone talks about a standard deviation.

But in my own case at least, I seem to have left behind any functional understanding of deeper statistical concepts somewhere in my second year of undergrad. That is, until some well-intentioned past colleagues taught me better. I’ve been grateful to them ever since.

Of a slightly more extensive list of tools and concepts, I’ve picked two that I view as absolutely critical to master, if we want to have any real chance of success in our quest to discern signal from noise:

Confidence interval: measurement is an imperfect process. Focusing solely on the point estimates that come out of a measurement exercise (mean, median, etc.) typically creates a false sense of certainty about their accuracy. A confidence interval shows how reliable a point estimate is by specifying a range within which the parameter is estimated to lie, at a chosen confidence level (typically 90%, 95% or 99%).

Two-sample t-test: behind the scary name lies a critical idea. Comparison is an essential part of almost every form of data analysis. Oftentimes we care less about a computed value in isolation and more about its significance in the context of other values, be it a measurement of the same parameter at a different time or in a different population, or a different parameter altogether. The two-sample t-test lets us determine, at a chosen confidence level, whether the means of two samples are indeed different from one another.

Let’s connect these tools with a more real-world example:

Suppose that, for whatever reason, we’ve decided to run an “employee engagement survey” to regularly measure our employee engagement scores (why that’s a bad idea is a topic for a different time). It’s an overly simplified survey in which we simply ask employees to rate their engagement on a scale from 1 to 7.

We’ve decided that if, on average, we score less than 5, we have an engagement problem and further action must be taken. The survey results came back and the average was 4.8. Should we drop whatever we’re doing right now and start figuring out ways to solve our engagement problem? Well, that depends. Depends on what? Depends on the reliability of that point estimate. How might we go about assessing that? Confidence Interval!
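To make this concrete, here is a minimal sketch of how such an interval could be computed. The survey responses are made up for illustration (40 hypothetical answers that happen to average 4.8), and the function name `mean_confidence_interval` is my own. It also assumes a normal approximation for the critical value, which is reasonable for a few dozen responses; for smaller samples a t-based critical value would give a slightly wider interval.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(scores, confidence=0.95):
    """Return (mean, lower, upper) using a normal approximation."""
    n = len(scores)
    center = mean(scores)
    # Standard error of the mean: sample standard deviation / sqrt(n)
    standard_error = stdev(scores) / sqrt(n)
    # Two-sided critical value, e.g. about 1.96 for 95% confidence
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * standard_error
    return center, center - margin, center + margin


# Hypothetical survey responses on the 1-7 scale (not real data)
scores = [5, 4, 6, 3, 5, 7, 4, 5, 2, 6, 5, 4, 6, 5, 3, 7, 4, 5, 6, 4] * 2
threshold = 5

avg, low, high = mean_confidence_interval(scores)
print(f"average = {avg:.2f}, 95% interval = ({low:.2f}, {high:.2f})")
if low < threshold < high:
    print("The interval straddles the threshold: we cannot tell which side we are on.")
```

With these made-up responses the 95% interval runs from roughly 4.4 to 5.2, so it contains our threshold of 5. The 4.8 alone looked like a problem; the interval says we can't be sure yet.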

We also decided to slice the data by department. While the average score for the marketing department was 5.8, the average score for the engineering department was 5.3. Wow! This seems like a serious problem. Should we drop whatever we’re doing right now and start figuring out what’s going on in engineering that their engagement score is so much lower than marketing’s? Well, that depends. Depends on what? Depends on whether those two averages are indeed different. How might we go about assessing that? A two-sample t-test!
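Here is a rough sketch of that comparison, again with made-up responses (ten hypothetical answers per department, chosen so the averages come out at 5.8 and 5.3). It uses Welch's variant of the two-sample t-test, which does not assume that both departments have the same variance; that choice, the function names, and the Simpson's-rule integration used to get the p-value without extra libraries are all my own assumptions for this illustration.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p_value(t_stat, df, steps=2000):
    """P(|T| >= |t_stat|), integrating the density with Simpson's rule."""
    upper = abs(t_stat)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # P(0 <= T <= |t_stat|)
    return max(0.0, 1 - 2 * area)


def welch_t_test(a, b):
    """Return (t statistic, degrees of freedom, two-sided p-value)."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t_stat = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t_stat, df, two_sided_p_value(t_stat, df)


# Hypothetical responses on the 1-7 scale (not real data)
marketing = [6, 5, 7, 6, 4, 7, 6, 5, 6, 6]    # average 5.8
engineering = [5, 6, 4, 7, 5, 6, 3, 6, 5, 6]  # average 5.3

t_stat, df, p_value = welch_t_test(marketing, engineering)
print(f"t = {t_stat:.2f}, df = {df:.1f}, p = {p_value:.2f}")
if p_value >= 0.05:
    print("At a 95% confidence level we cannot say the two averages differ.")
```

With these made-up samples the p-value comes out at around 0.3, far above the 0.05 cutoff that matches a 95% confidence level. A half-point gap between two small groups can easily be noise, so it may be too early to go hunting for what's wrong in engineering.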