Adopting Data Science Solutions for Business: Balancing Complexity, Accuracy and Interpretability

Thomas Gorin
GAMMA — Part of BCG X
May 28, 2019

The use of data science across companies and industries worldwide ranges from non-existent to advanced. Even inside companies that actively use data science, the way it is applied can vary dramatically. Many top tech companies rely heavily on data science in their day-to-day operations, while other more traditional companies such as grocery chains, consumer packaged goods (CPG), or industrial goods (IG) companies focus more heavily on human capital. And within all of these companies, some functions such as sales rely more on qualitative approaches, while teams such as IT, or scheduling and revenue management for airlines, may focus much more on data-driven decision making.

Bringing data science to clients always presents very interesting challenges, not the least of which is the notion of complexity and accuracy versus interpretability. Depending on where they are on their journey to advanced implementations of data science, clients may be more or less willing to sacrifice some measure of complexity for the sake of clarity or understandability. On the low-complexity end of the scale, you run into the challenge of whether overly simple models actually bring value to your client. On the high-complexity end of the scale, users are often confronted with “black box” issues whereby the complexity of the models makes it impossible to interpret and understand what happens between the inputs and the outputs. This “black box” concept raises some interesting human challenges: How much do you trust your models? At what point do you lose credibility? How do you help support and validate your model’s recommendations to your business users? In the end, it all boils down to a single and critical success metric: adoption. The most significant challenge is oftentimes balancing complexity and accuracy with interpretability to achieve adoption, and then managing the side effects of striking this balance.

As the data science arm of BCG, BCG GAMMA is often sought out to devise data-driven solutions for clients in very specific fields. We must always find ways to balance each client’s technical acumen, expectations and capabilities with our own desire to deliver an optimal solution. In this post, I would like to discuss how to think through this balancing act while working with clients to help them solve specific problems, as well as to share a number of specific examples.

Technical acumen and expectations

Over the past decade, data science has taken on a prominent role in business. Google, Amazon, Apple and Microsoft use it to develop very visible applications such as digital assistants. Uber, Lyft, Didi and others use it to create ride-sharing applications. And companies such as Waze use it to develop route-guidance optimization applications. The increasing availability of data, the lower cost of sensors and IoT solutions, and the increase in computation power have all made data science applications like these possible.

Given the success of these applications, many companies are looking to data science to improve their top- and bottom-line performance. The potential applications are endless. On one end of the spectrum are churn-modeling tools such as defection detection, or prediction engines to calculate when customers might leave your company and move to a competitor. At the other extreme are financial-modeling applications, HR solutions, and machine-failure prediction tools. But no matter what the intended application, the question I often have to grapple with is whether our potential clients are looking to data science as a solution looking for a problem, or truly looking to solve an actual, day-to-day business challenge.

I recently engaged with a large retail bank that wanted to improve its product pricing. The bank’s goal was to provide its customers with the best possible price so it would have a profitable operation, while growing and shielding itself from potentially revenue-negative customers. In the course of my interactions with the bank’s leadership team, it became clear that they were trying to achieve two very different goals:

  1. Optimize their pricing
  2. Introduce “machine-learning” applications at the bank

It didn’t take long to discover that the second goal was in conflict with the first. Why? Because machine learning is a field that lends itself to a specific set of applications. Furthermore, while the term “machine learning” encompasses a large class of models tasked with making inferences from data, it is often conflated with a few very specific techniques, such as random forest classifiers or ensemble models.

The bank’s leadership team was looking for a solution that would leverage this latter class of models. Our assessment was that, for this particular task, a time-series approach rather than an ensemble model was better suited to their needs. I have found that, even in use cases where ensemble models could theoretically be used, a more traditional logistic regression approach (which, by the way, is also machine learning) can perform much better.
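To make that point concrete, here is a minimal sketch on synthetic data (not the bank’s code, and the numbers mean nothing): in scikit-learn, a logistic regression is trained, validated and scored through exactly the same workflow as a random forest, which is why it belongs to the machine-learning family just as much as any ensemble does.

```python
# Minimal, illustrative sketch on synthetic data: a logistic regression and
# a random forest fit and scored with the same scikit-learn workflow.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=5_000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    "logistic regression": LogisticRegression(max_iter=1_000),
    "random forest": RandomForestClassifier(n_estimators=200, random_state=0),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    print(f"{name}: test AUC = {auc:.3f}")
```

On many structured business problems the simpler of the two is competitive, and it comes with coefficients that people can actually read.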

This dichotomy between expectations and model performance quickly turned the process of optimizing the bank’s pricing into an uphill battle. To get beyond this impasse, our team decided to place more emphasis on educating our client team about how each type of model applies to different situations, and why the use of time-series solutions or logistic regressions was better suited to their specific needs.

Solving the problem — then, explaining the solution

In another example, I recently worked with an energy company that was trying — among other efforts — to improve its customer retention at the end of contracted service periods. Our team’s analysis of the problem led to the use of a classical random forest classifier, a machine-learning approach that leverages ensemble models. In this specific instance, our client’s other goals did not conflict with this specific one, so we were able to get quick buy-in to our approach and algorithm selection. To do so, we chose to fully leverage the power of random forest classifiers and, in fact, used more than 400 features in the model. We followed the usual approach to building such machine learning models by training the model on numerous data subsets, followed by validation on test sets. We also performed forward and backward tests to prevent seasonal issues from affecting the quality of our models. In the end, we landed on a very accurate model to predict churn.
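For readers curious what that pipeline looks like in practice, the sketch below is a heavily simplified, hypothetical stand-in: a handful of invented features instead of 400+, synthetic data, and scikit-learn’s TimeSeriesSplit as a rough proxy for the forward and backward tests. It illustrates the general pattern rather than the client model.

```python
# Illustrative sketch only: a random forest churn classifier validated on
# time-ordered folds, a simplified stand-in for the forward/backward tests
# described above. Features, data and labels are all invented.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import TimeSeriesSplit

rng = np.random.default_rng(0)
n = 10_000
X = pd.DataFrame({
    "tenure_months": rng.integers(1, 120, n),
    "monthly_usage_kwh": rng.gamma(2.0, 300.0, n),
    "support_calls_90d": rng.poisson(1.2, n),
})
# Hypothetical churn label loosely tied to the features above.
churn_prob = 1 / (1 + np.exp(0.01 * X["tenure_months"]
                             - 0.4 * X["support_calls_90d"]))
y = pd.Series(rng.random(n) < churn_prob, name="churned")

for fold, (train_idx, test_idx) in enumerate(TimeSeriesSplit(n_splits=5).split(X)):
    model = RandomForestClassifier(n_estimators=300, random_state=0)
    model.fit(X.iloc[train_idx], y.iloc[train_idx])
    auc = roc_auc_score(y.iloc[test_idx],
                        model.predict_proba(X.iloc[test_idx])[:, 1])
    print(f"fold {fold}: out-of-time AUC = {auc:.3f}")
```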

We faced another challenge when it was time to explain the model to our client counterparts. We had difficulty explaining the methodology in ways that could be understood by a team at the very beginning of its data science journey. Eventually, we were able to get buy-in from the client team, but the “black boxiness” of the model created additional confusion when the team asked us to provide traditional regression metrics and parameter estimate values for evaluation. Obviously, these ensemble models do not output such parameters, so we could not provide them to the client. Instead, we relied on partial-dependency charts to demonstrate some of the relationships.
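Partial-dependence charts are simple to produce once the model is trained. The sketch below, again on invented data and feature names, shows the general idea: plot how the predicted churn probability moves as one feature varies while the others are held at their observed values.

```python
# Sketch: when an ensemble has no coefficient table, partial-dependence plots
# can still show the direction and shape of each feature's effect.
# Data and feature names are hypothetical.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import PartialDependenceDisplay

rng = np.random.default_rng(1)
X = pd.DataFrame({
    "tenure_months": rng.integers(1, 120, 5_000),
    "support_calls_90d": rng.poisson(1.2, 5_000),
})
y = rng.random(5_000) < 1 / (1 + np.exp(0.01 * X["tenure_months"]
                                        - 0.4 * X["support_calls_90d"]))

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
PartialDependenceDisplay.from_estimator(
    model, X, features=["tenure_months", "support_calls_90d"]
)
plt.tight_layout()
plt.show()
```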

This communication challenge highlights what can happen when a company moves quickly from little or no use of data science to advanced machine learning solutions. It further suggests that, in this case at any rate, it might have been more prudent — and probably just as effective — to propose a more traditional approach to data science solutions. Had we used a more interpretable logistic-regression solution to this churn model, we might have been able to bring the client along the data science journey faster by providing them with intermediate anchor points in the data.
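For illustration, here is roughly what those intermediate anchor points could look like: a hypothetical logistic-regression churn model whose standardized coefficients can be read off as odds ratios. The data, feature names and parameter choices are all invented.

```python
# Sketch of the more interpretable alternative discussed above: a logistic
# regression whose standardized coefficients are reported as odds ratios.
# Data and feature names are hypothetical.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(2)
X = pd.DataFrame({
    "tenure_months": rng.integers(1, 120, 5_000),
    "support_calls_90d": rng.poisson(1.2, 5_000),
    "late_payments_12m": rng.poisson(0.5, 5_000),
})
y = rng.random(5_000) < 1 / (1 + np.exp(0.01 * X["tenure_months"]
                                        - 0.5 * X["support_calls_90d"]
                                        - 0.3 * X["late_payments_12m"]))

pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1_000))
pipe.fit(X, y)

coefs = pipe.named_steps["logisticregression"].coef_[0]
for feature, coef in zip(X.columns, coefs):
    print(f"{feature:>20}: odds ratio per std. dev. = {np.exp(coef):.2f}")
```

Statements such as “one standard deviation more support calls multiplies the churn odds by roughly X” are exactly the kind of anchor a team new to data science can challenge and validate against its own experience.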

When implementing data science solutions, it is important to take into consideration both the client’s underlying goals for the solution — and their ability to understand and relate to the solution you are working on. In the first example I shared, a large part of the effort centered on first educating the client about the value of the solutions we were proposing, and then on applying the right solution to the problem. In the second example, had we been quicker to recognize where our client was in their data science journey, we could have taken a slightly different direction that would have solved their immediate problem — and allowed us to gradually enhance their data science skills.

The strength of the foundational data — and of the business users

Another area of great importance when considering data science solutions is the state of a client’s data environment, processes, and tool sets. If the client does not have good data available, then you will not be able to build a successful model. This may sound incredibly obvious, but it is not always apparent to internal users. For example, a leading truck manufacturer asked us to help improve its pricing and sales approaches by leveraging internal data along with some external, commercially available data. We quickly found that the quality of the in-house data was much worse than we anticipated. The main problem was that instead of using unique customer identifiers, the company relied on manual alphanumeric entries for each customer. This meant that a single customer who made ten purchases could have ten different name entries, with variations to each entry. If the variations were slight, we could fuzzy match their names relatively easily. However, in many instances the names were completely different from one transaction to the next. Unfortunately, similar issues also existed in the external commercial data the client had purchased. These kinds of data issues are not new, but they must be evaluated early in the process so that everyone understands from the start what is and is not feasible.
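As a toy illustration of that matching step, the sketch below uses only Python’s standard library, with made-up names and an arbitrary similarity threshold: score each raw entry against the known customer names and accept the best match only when the similarity is high enough. Real entity resolution needs far more than this (normalization, blocking, and a manual review queue for ambiguous cases).

```python
# Toy sketch of fuzzy matching customer-name variants, standard library only.
# Names and the 0.85 threshold are invented for illustration.
from difflib import SequenceMatcher

known_customers = ["ACME Trucking LLC", "Smith Logistics Inc", "Bayside Freight"]

def best_match(raw_name: str, candidates: list[str], min_score: float = 0.85):
    """Return the closest known customer name, or None if nothing is close enough."""
    scored = [(SequenceMatcher(None, raw_name.lower(), c.lower()).ratio(), c)
              for c in candidates]
    score, name = max(scored)
    return name if score >= min_score else None

print(best_match("Acme Truckng LLC", known_customers))    # -> "ACME Trucking LLC"
print(best_match("Northern Haulage Co", known_customers)) # -> None (no close match)
```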

Another critical piece of data science adoption revolves around the actual users of the solutions created by data science teams. This takes us back to the black box solution I mentioned earlier. When I worked in the Revenue Management (RM) department of a major airline, we leveraged a variety of data science tools developed in-house or by outside vendors. The users of these tools (that is to say, the members of the RM team I was on) were generally highly educated and advanced users. Their degree of skill was due, in part, to the historical legacy of RM within the airlines and to the fact that the group was an entry point for the airline’s future talent. As a result, the team had access to very detailed training programs with thorough explanations of the algorithms and methods used in revenue management. This preparation gave the RM team members enough confidence in the systems to understand their outputs and to resolve 80–90% of their questions about counterintuitive outputs on their own.

As the field of revenue management continues to evolve, however, it is becoming increasingly difficult to debug some algorithms. For example, Bayesian forecasting, which relies on continuous updates of the model parameters as new data comes in, can be very difficult to interpret when counterintuitive results appear. As a rule, though, RM teams across the industry are sufficiently trained to understand the systems they use every day, which has allowed for unparalleled adoption of RM solutions — at least in this sector of the economy.
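For readers unfamiliar with the mechanics, here is a deliberately stripped-down sketch of Bayesian updating: a conjugate Gamma-Poisson demand model with invented numbers. Real revenue-management forecasters are far more elaborate, but the core idea of revising a parameter estimate as each new observation arrives is the same; it is exactly this continuous movement of the parameters that can make a counterintuitive forecast hard to trace back to any single cause.

```python
# Stripped-down sketch of Bayesian updating: a Gamma prior on the daily
# booking rate is revised after each day's (invented) observed bookings.
alpha, beta = 2.0, 1.0            # Gamma(shape, rate) prior; prior mean = 2 bookings/day
observed_bookings = [3, 1, 4, 0, 2, 5]

for day, bookings in enumerate(observed_bookings, start=1):
    alpha += bookings             # conjugate update for Poisson-distributed counts
    beta += 1
    print(f"day {day}: posterior mean demand = {alpha / beta:.2f} bookings/day")
```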

For the truck manufacturing sales team, on the other hand, we devised a slightly more rudimentary solution. The solution leveraged a smaller number of parameters and allowed us to communicate our approach to the entire sales team in simpler terms — but still retained sufficient accuracy and complexity to make the results meaningful. This transparency was critical to our ability to get buy-in from the sales team — the primary users of the tool. What we absolutely wanted to avoid was a single outlier recommendation leading a team member to conclude that the tool was unreliable and produced “bad” recommendations. To that specific end, we conducted group training sessions to share the model parameters with the entire sales team, and invited their comments, feedback and thoughts on the impact of these parameters. We also made sure the data relationships were intuitive enough to be easily understood by everyone on the team. And as we approached the end of the engagement, we followed up with one-on-one meetings with team members to address any lingering questions or doubts.

This approach drove significant team adoption, but despite our best efforts there were a few holdouts — team members who felt uncomfortable with our approach and chose not to use the tool. They made this decision despite the strong mandate from their own leadership team to use the tool, and the clear performance improvement among team members who did so.

These examples illustrate the importance of calibrating your solution to your user group. Only by taking this step can you hope to get widespread adoption of your solution. After all, what good is a superb algorithmic solution if nobody uses it because they don’t understand it or trust its outputs?

There is more to good data science than good algorithms

My experience implementing data science solutions for numerous clients has taught me that the following items are key to ensuring the success of your solutions and implementations:

1. Know your client’s goals and aspirations, as well as their technical ability — and their reasons for wanting to implement data science solutions in the first place.

2. Understand what is technically feasible and applicable given user capabilities.

Once you have a clear view of these two key aspects, you may need to help your clients align their goals with their capabilities. Once they are aligned, though, you should be able to set a clear path for the use of data science within the company. In the end, it’s all about client adoption of your solution. This is a matter of balancing complexity and your desire for an optimal solution with the realities of what you will actually be able to implement — and with what users will be able to understand and adopt, given their current environment and capabilities.

I want to leave you with one final example of such tradeoffs. During my time at a leading enterprise-pricing software provider, the company’s science team created a wonderful pricing-optimization solution based on a very detailed segmentation approach. When this solution was first released in the early 2000s, it was significantly ahead of its time. However, once the tool was in market, the science team realized that while there was nothing wrong with the solution, it was creating significant user adoption challenges: It was just too complex to be well understood by the majority of clients, most of whom were just embarking on their pricing journeys. To support user needs, the team simplified the solution to speed adoption and then, over time, gradually added technical complexity back into the tool as clients became more educated in the art and science of pricing.

The Bell Curve of Data Science Adoption

When I think about the relationship between adoption and complexity and interpretability, I often visualize the chart shown below. At the far left, the graph shows that overly simple models will result in little adoption: Nobody wants to use a demand forecast that consistently forecasts the average of historical data. As you move to the right across the graph, complexity increases (and, consequently, so does the value of your solution), and adoption increases exponentially, with only a small amount of variability around the mean adoption (orange line). As you move farther to the right, complexity increases to the point that it starts hurting mean adoption rates. Along with this decrease in adoption comes a large increase in variability around the mean, which reflects the willingness of more advanced companies to adopt complex solutions. You will notice that I have not included scales on either axis of this chart, since these values can be very fluid as businesses become more accustomed to data science solutions. Furthermore, there is no single value that applies to any given business.
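Since the chart is conceptual rather than measured, the short sketch below simply stylizes the shape I have in mind: a bell-shaped mean adoption curve with a variability band that widens as complexity grows. All of the numbers are invented, and the axes are deliberately left unscaled.

```python
# Purely illustrative: a stylized adoption-vs-complexity curve, not measured data.
import numpy as np
import matplotlib.pyplot as plt

complexity = np.linspace(0, 10, 200)
mean_adoption = np.exp(-((complexity - 6) ** 2) / 8)     # bell-shaped mean adoption
spread = 0.05 + 0.25 * (complexity / 10) ** 2            # variability grows with complexity

plt.plot(complexity, mean_adoption, color="orange", label="mean adoption")
plt.fill_between(complexity, mean_adoption - spread, mean_adoption + spread,
                 alpha=0.2, label="variability across companies")
plt.xlabel("model complexity (unscaled)")
plt.ylabel("adoption (unscaled)")
plt.xticks([]); plt.yticks([])                           # no scales, as noted in the text
plt.legend()
plt.show()
```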

Where your next client falls along this curve will depend on their own journey to data science, their goals, and how well you are able to help ease them down the path. Ultimately, the calibration of the axes will depend on your client’s technical acumen and the strength of the foundational data — and on your ability to strike the right balance between complexity and understandability.

Thomas Gorin
Principal Data Scientist, BCG GAMMA, The Boston Consulting Group