2018: The year data started working for you
Data science, machine learning, artificial intelligence.
These were among the most frequently used terms in business in 2017. For example, Google Trends reveals that “machine learning” and “data science” were searched for 400% more in 2017 than in 2004.
“Every second of every day, our senses bring in way too much data than we can possibly process in our brains.” — Peter Diamandis, Chairman/CEO, X-Prize Foundation
But living through a technological revolution doesn’t mean we’ll reap its benefits if we just sit on the sidelines. We’ve got to engage with data and do something with it.
Why data, especially now, matters
Information, while not sufficient, is important for making prudent decisions. Our emotions are fleeting and easily swayed — so “going with our gut” is often a recipe for disaster.
Firms can now collect vast amounts of data and store it at virtually zero marginal cost.
For example, if you’re a project manager, you’re almost surely keeping track of client work and the details of each contract. Are you also keeping track of information about what went well in the project, what didn’t, who was involved in it, and how long it took? Now, what about your vendors? Are you keeping track of all your purchasing decisions?
These pieces of information might not be incredibly useful on their own, but together they can yield an array of insights. For example, research by Erik Brynjolfsson and Kristina McElheran shows that, in a sample of manufacturing plants, the incidence of data-driven decision-making nearly tripled from 11% in 2005 to 30% in 2010. Adopting these practices was associated with a 3% increase in productivity, with even larger gains among plants that had more information technology and more skilled workers.
“The whole is greater than the sum of its parts.” — Aristotle
Data-driven decision-making allows you to pinpoint patterns and develop best practices so that you can cut out the behaviors or mistakes that are impeding your value creation.
Step 1: Gathering the data
Infrastructure is important. Sometimes organizations get stuck in a rut with outdated software or a platform that was built to suit the needs of a particular client base or market that has since changed.
In these cases, there is a very real challenge that cannot be resolved overnight. However, even then, you can still devise simple experiments to help discern what works and what doesn’t, and what ideas might hold promising new business opportunities.
More often than not, however, your data architecture is sufficiently robust to allow integration of different pieces of information either across or within business units. What’s critical is that you organize the different pieces of information in a way that’s comparable across units and over time.
So, if you are measuring project performance, you want to maintain a common metric that you can analyze over time and/or space — otherwise you’ll be stuck comparing apples and oranges.
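To make the apples-and-oranges point concrete, here is a minimal sketch of putting project records from two business units onto one common metric (revenue per hour of effort) by unit and year. The records, field names, and the 8-hour-day conversion are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical project records; field names and figures are made up.
# Note that the two units log effort differently (hours vs. days).
projects = [
    {"unit": "consulting", "year": 2016, "revenue": 120_000, "effort": 800, "effort_unit": "hours"},
    {"unit": "consulting", "year": 2017, "revenue": 150_000, "effort": 900, "effort_unit": "hours"},
    {"unit": "engineering", "year": 2016, "revenue": 200_000, "effort": 250, "effort_unit": "days"},
    {"unit": "engineering", "year": 2017, "revenue": 180_000, "effort": 200, "effort_unit": "days"},
]

HOURS_PER_DAY = 8  # assumed conversion factor


def effort_in_hours(record):
    """Convert a record's effort to hours so all units share one scale."""
    if record["effort_unit"] == "hours":
        return record["effort"]
    if record["effort_unit"] == "days":
        return record["effort"] * HOURS_PER_DAY
    raise ValueError(f"unknown effort unit: {record['effort_unit']}")


def revenue_per_hour(records):
    """Aggregate revenue and hours by (unit, year) and return their ratio."""
    totals = defaultdict(lambda: [0.0, 0.0])  # key -> [revenue, hours]
    for record in records:
        key = (record["unit"], record["year"])
        totals[key][0] += record["revenue"]
        totals[key][1] += effort_in_hours(record)
    return {key: revenue / hours for key, (revenue, hours) in totals.items()}


metric = revenue_per_hour(projects)
for (unit, year), value in sorted(metric.items()):
    print(f"{unit:12s} {year}  {value:8.2f} per hour")
```

Once every record is expressed in the same units, comparisons across business units and across years become meaningful. The choice of revenue per hour is only one possible metric.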
Step 2: Exploring the data and formulating the question
In research, one usually has to formulate a question first and then find the right data, since different research questions place different demands on the data. In an organizational setting, the process is often reversed: you’ve already got the data, but you are not necessarily sure how it can be used to produce actionable insights.
Once you have the type of information discussed in step 1, you just need to import it into any run-of-the-mill statistical package. R, for example, is one of the most commonly used, and it’s free and open source.
Now, this is the creative part. You should use your experience to discipline your use of the data. That’s what Kathryn Shaw effectively terms “insider econometrics”: using what you know about the organization (or working with others in it) to make more informed and useful statistical models.
Although artificial intelligence has come a long way at pattern recognition, there is no substitute for discernment and experience in human decision-making. So, ask yourself: “What are the outcomes of interest for you and your broader organizational unit?”
Perhaps it’s revenue, but perhaps it’s employee engagement. Your priorities might shift over time.
“The goal is to turn data into information, and information into insight.” — Carly Fiorina, Former CEO at the Hewlett-Packard Company
Step 3: Model selection and analysis
This is the hardest step since it requires expertise in not only statistics and/or computer science, but also causal inference.
Basic statistical regression analysis can go a long way in revealing incredible insights and patterns from the data. But, just because you see two things moving together doesn’t mean you’ve identified actionable intelligence. Indeed, as the saying goes, “correlation does not imply causation”.
Let’s take the instance of employee engagement. In an academic partnership with Payscale, I used their crowdsourcing platform to relate measures of corporate culture to compensation and other employee characteristics, in order to recover how much employees value culture in monetary terms.
However, the fundamental statistical challenge is that more productive individuals are likely to get offers from companies that have both better financial offers and better non-financial amenities. Put simply, standard methods might spuriously suggest a relationship between compensation and corporate culture simply because more productive individuals are sorting into those jobs.
While my solution to the statistical problem can be found in the full working paper, the point here is simply that you’ve got to be aware that there is more to the story in the patterns you identify: potential unobserved variables, mis-measurement in the data, and bi-directionality between the input and output variables, just to name a few.
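The sorting problem is easy to see in a small simulation. The sketch below is not the method from the working paper. It is a toy example with made-up numbers in which culture has, by construction, no effect on pay, yet a naive regression finds one because productivity drives both. In the simulation we get to observe productivity and control for it, which is exactly what real data usually does not allow.

```python
import random


def mean(values):
    return sum(values) / len(values)


def slope(x, y):
    """OLS slope from regressing y on x (with an intercept)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var


def residuals(y, x):
    """Part of y that is left over after regressing y on x."""
    b = slope(x, y)
    mx, my = mean(x), mean(y)
    return [(yi - my) - b * (xi - mx) for xi, yi in zip(x, y)]


random.seed(0)
n = 5000
TRUE_EFFECT = 0.0  # assumption: culture has no causal effect on pay

# Productive people sort into better-culture, better-paying jobs.
productivity = [random.gauss(0, 1) for _ in range(n)]
culture = [p + random.gauss(0, 1) for p in productivity]
pay = [2.0 * p + TRUE_EFFECT * c + random.gauss(0, 1)
       for p, c in zip(productivity, culture)]

# Naive regression of pay on culture picks up the sorting.
naive = slope(culture, pay)

# Partial out productivity from both variables, then regress the
# residuals (Frisch-Waugh); this is equivalent to controlling for it.
adjusted = slope(residuals(culture, productivity),
                 residuals(pay, productivity))

print(f"naive slope:    {naive:.2f}")
print(f"adjusted slope: {adjusted:.2f}")
```

The naive slope comes out large and positive, while the adjusted slope is close to the true effect of zero. When productivity is unobserved, only the naive number is available, which is why the pattern alone is not actionable intelligence.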
“Without big data analytics, companies are blind and deaf, wandering out onto the web like deer on a freeway.” — Geoffrey Moore, Consultant
Beyond business intelligence
Although saving money and creating better value for your clients are important, data can also be used to make better hiring decisions. Hiring is important because it not only determines your ability to actually meet client needs, but also influences the way your entire team and organization communicate and solve problems.
Indeed, human capital is the most important type of capital, and individuals who are not on board with the vision, or who are bad apples more generally, can stifle an otherwise impactful enterprise.
“Employees are a company’s greatest asset — they’re your competitive advantage. You want to attract and retain the best; provide them with encouragement, stimulus, and make them feel that they are an integral part of the company’s mission.”— Anne M. Mulcahy, Former CEO at the Xerox Corporation
Finding the optimal turnover rate is therefore important. You want some turnover, since your organization needs a way to part with hires who turn out not to work well, but you don’t want too much, since it distracts from client work and makes employees worry about their jobs.
Now, assuming your human resources team maintains good records of both current and former employees, you can relate employee characteristics to the probability that an individual leaves.
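One common way to relate characteristics to the probability of leaving is a logistic regression. The sketch below fits one by plain gradient ascent on synthetic data. The two characteristics (standardized engagement and overtime scores), the coefficients used to generate the data, and the function names are all assumptions made for illustration, not real HR records or any particular library’s API.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, labels, steps=300, lr=0.5):
    """Fit logistic regression by batch gradient ascent on the log-likelihood.

    Returns [intercept, weight_1, ..., weight_k].
    """
    n = len(rows)
    weights = [0.0] * (len(rows[0]) + 1)  # weights[0] is the intercept
    for _ in range(steps):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            p = sigmoid(weights[0] + sum(w * xi for w, xi in zip(weights[1:], x)))
            err = y - p
            grad[0] += err
            for j, xi in enumerate(x, start=1):
                grad[j] += err * xi
        weights = [w + lr * g / n for w, g in zip(weights, grad)]
    return weights


def leave_probability(weights, x):
    """Predicted probability of leaving for one employee."""
    return sigmoid(weights[0] + sum(w * xi for w, xi in zip(weights[1:], x)))


# Synthetic employees: leaving is less likely with high engagement and
# more likely with heavy overtime (assumed relationships).
random.seed(1)
rows, labels = [], []
for _ in range(1000):
    engagement = random.gauss(0, 1)
    overtime = random.gauss(0, 1)
    p_leave = sigmoid(-1.0 - 1.0 * engagement + 0.8 * overtime)
    rows.append((engagement, overtime))
    labels.append(1 if random.random() < p_leave else 0)

weights = fit_logistic(rows, labels)
intercept, w_engagement, w_overtime = weights

print(f"engagement weight: {w_engagement:.2f}")
print(f"overtime weight:   {w_overtime:.2f}")
print(f"P(leave | disengaged, overworked): {leave_probability(weights, (-1.0, 1.0)):.2f}")
```

The signs of the fitted weights tell you which characteristics go with a higher or lower chance of leaving. As in Step 3, these are associations in the records and not necessarily causes.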
What’s incredible is that you’re not limited to simple measurements such as gender and education: you can also draw on a whole set of performance metrics and even measures of sentiment based on employee writing (e.g., emails). For example, natural language processing techniques are now sufficiently advanced that you could feed in a text document, along with pre-defined keywords and filters, to measure an individual’s level of aggression.
My friend Bo Cowgill at Columbia Business School ran an experiment precisely along these lines. Specifically, he “trained an algorithm” on historical employee and turnover data, including information from resumes, to determine how good algorithms are at making hiring decisions compared with relying purely on managerial discretion.
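In its crudest form, a keyword-based measure is just the share of words in a document that appear on a pre-defined list. The sketch below shows that idea. The word list and the function name are hypothetical, and real natural language processing tools go well beyond counting keywords.

```python
import re

# Hypothetical keyword list, chosen only for illustration.
AGGRESSIVE_TERMS = {"unacceptable", "ridiculous", "incompetent", "useless", "idiotic"}


def aggression_score(text, lexicon=AGGRESSIVE_TERMS):
    """Share of words in `text` that appear in the keyword list (0.0 to 1.0)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in lexicon)
    return hits / len(tokens)


calm = "Thanks for the update, happy to help with the next draft."
heated = "This is unacceptable and ridiculous."

print(aggression_score(calm))    # no listed keywords appear
print(aggression_score(heated))  # 2 of the 5 words are on the list
```

Dividing by the number of words keeps long and short messages comparable. A score like this could then be used as one more employee characteristic alongside the performance metrics.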
Perhaps surprisingly, the applicants chosen by the algorithm were more likely to pass their first-round interview, accept the job if they were offered one, and produce more once hired than their counterparts who went through the standard process.
That doesn’t mean we should relegate all decision-making to algorithms. Rather, we should know how to interact with and use data to make better decisions that increase the size of the pie for everyone.
So, don’t be afraid of data in the new year — make it work for you and your organization!