Business Problem Solving with Data Science — Framing the Business Problem — 1 of 3

Shweta Doshi
15 min read · Jan 19, 2020

One problem I face when talking to aspiring data scientists is their focus on machine learning techniques rather than on the business problems that data science can solve. The CXO doesn’t care about your accuracy score; he or she wants a very different kind of problem solved with data science for the company! Here is an attempt to help people think about business problem solving with data science in a structured way!

Thanks to Dileep Karri for his contribution on this post!

Let’s first understand a Business Problem

An international restaurant aggregator company, YumEats!, has decided to expand its operations and enter the Indian subcontinent. YumEats! allows users to select food from restaurants and have it delivered to their homes. Like every aggregator in this space, it has two main revenue channels: B2C and B2B. The B2C revenue comes from a special rewards program or exclusive membership program that allows consumers to subscribe to a monthly or annual membership. The B2B revenue comes from a commission collected from the restaurants as a portion of the fee on each order. Hence, the company has to acquire both consumers and other businesses.

YumEats! has set up a data science wing in their company and hired you as a data scientist on the team. They want you to analyse data and help the company expand in terms of both growth and revenue. This would be possible with more daily active users on the platform as well as many restaurants with a wide variety of cuisines that would further drive traffic to the app. They have already run a whirlwind campaign with deep discounts that are piling up losses and putting a lot of strain on the company. You have been given a mandate: to advise where to cut spending with minimal damage and, simultaneously, to find ways to increase revenue with data-driven insights.

We will come back to this problem in a while, but let’s first understand where you are right now in your journey to become a data scientist.

*

You know the basics of Machine Learning and understand the basic nuts and bolts of the algorithms. Great! Awesome! But before you apply the algorithm, you need data. And before you collect data, you need to be clear about the actual business problem you are solving. In the end, businesses look to data science teams to give insights and help solve problems. The journey from a business problem to a data science problem is not so straightforward, and hence in the next posts, I will make an attempt to demystify the process. The process of constructing a data science solution to a business problem is often represented as the following path: define the problem, set the objectives, prepare the data, build and train the model, and make predictions and fine-tune.

While I will focus mostly on how to define the problem and set the objectives in this post, we will also briefly touch upon the other steps and show how to solve a business problem. The first step of the path, defining the problem, contains tasks such as understanding business needs, scoping a solution, and planning the analysis. However, while translating a business problem into a data science model is a process, it is not linear. Each step usually needs to be revisited multiple times in order to arrive at an analytically sound, maintainable, and scalable solution. The “define the problem” box in the simple linear diagram above can actually be exploded into a much more nuanced process:

It is very common to initially define a business need, but then, as you proceed to more fully scope the problem, realize that an entirely different need is more pressing. Likewise, it is common to scope a solution, only to realize later that data access limitations or engineering constraints require a change in that scope. Often, these changes in plans will even occur after you think you have left the “define the problem” stage of the process. For example, it is common for model tuning issues to raise scalability concerns, which may require a substantial re-evaluation of what problems you are trying to solve and how you plan to solve them.

A large amount of a data scientist’s work takes place away from the computer: data scientists must work with non-technical co-workers to define the goals and scope of their projects. A well-understood business problem and a well-designed plan of action will lead to better results, less wasted effort, and happier stakeholders.

Let us take a simple example to look at each of the steps that we mentioned in our diagram. The problem we are looking at is ‘to increase the number of retweets a tweet would get’.

1. Define the problem

Predict the number of retweets a new tweet would get.

2. Set the objectives

Identify the features that would influence the number of retweets.

3. Prepare the data

Trends corresponding to the hashtag accompanying the tweet would be a good signal.

4. Build and Train the model

How accurate is the model? How much of an error can the business tolerate?

5. Make predictions and fine-tune

The initial predictions were quite off. Think of further refinements. Can this problem be solved? What are the barriers for an effective solution — data? computation?
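To make steps 4 and 5 a little more concrete, here is a minimal Python sketch. It is not a real retweet model. It assumes a single made-up feature (a hashtag trend score), a handful of invented data points and an assumed error tolerance. The names `history`, `fit_line` and `TOLERATED_ERROR` are hypothetical. The sketch fits a straight line on the earlier tweets and then checks whether the error on the held-out tweets is something the business could tolerate.

```python
from statistics import mean

# Hypothetical toy data: (hashtag_trend_score, retweets). All values are invented.
history = [(1, 12), (2, 19), (3, 31), (4, 42), (5, 48), (6, 61)]


def fit_line(points):
    """Ordinary least-squares fit of retweets = slope * trend_score + intercept."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (
        sum((x - x_bar) * (y - y_bar) for x, y in points)
        / sum((x - x_bar) ** 2 for x in xs)
    )
    intercept = y_bar - slope * x_bar
    return slope, intercept


def mean_absolute_error(points, slope, intercept):
    """Average absolute gap between predicted and actual retweets."""
    return mean(abs(y - (slope * x + intercept)) for x, y in points)


# Step 4: build and train on the first four tweets, hold out the last two.
train, holdout = history[:4], history[4:]
slope, intercept = fit_line(train)

# Step 5: make predictions and compare the error with what the business accepts.
mae = mean_absolute_error(holdout, slope, intercept)
TOLERATED_ERROR = 5.0  # assumed tolerance, in retweets; agree this with stakeholders
within_tolerance = mae <= TOLERATED_ERROR

print(f"slope={slope:.2f}, intercept={intercept:.2f}, holdout MAE={mae:.2f}")
print("Good enough for the business?", within_tolerance)
```

The point of the sketch is the last comparison: the question is less “how accurate is the model?” and more “is the error smaller than what the business can live with?”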

How do we arrive at the answer to these questions in a systematic manner? Let us understand through a case study.

Now that I have set the stage, back to our original case study

*

An international restaurant aggregator company, YumEats!, has decided to expand its operations and enter the Indian subcontinent. YumEats! allows users to select food from restaurants and have it delivered to their homes. Like every aggregator in this space, it has two main revenue channels: B2C and B2B. The B2C revenue comes from a special rewards program or exclusive membership program that allows consumers to subscribe to a monthly or annual membership. The B2B revenue comes from a commission collected from the restaurants as a portion of the fee on each order. Hence, the company has to acquire both consumers and other businesses.

YumEats! has set up a data science wing in their company and hired you as a data scientist on the team. They want you to analyse data and help the company expand in terms of both growth and revenue. This would be possible with more daily active users on the platform as well as many restaurants with a wide variety of cuisines that would further drive traffic to the app. They have already run a whirlwind campaign with deep discounts that are piling up losses and putting a lot of strain on the company. You have been given a mandate: to advise where to cut spending with minimal damage and, simultaneously, to find ways to increase revenue with data-driven insights.

The above problem is a business problem presented in its rawest form: somehow improve revenue and cut losses by analysing the data. Before any form of analysis can be performed, a thorough understanding of the problem is required. To understand the problem in its entirety, you first need to talk to the respective stakeholders.

Data scientists always work with stakeholders. Stakeholders are people who have a say in how the business operates and in what goals the business needs to prioritize. Stakeholders could be managers and executives, but they could also be individual contributors who have responsibilities over specific aspects of marketing, engineering, sales, finance, operations, or any of the facets of a business enterprise. Different stakeholders have different requirements.

For example, in the case study just presented, because of the direct revenue impact and the spending cuts needed, the stakeholder could be a leadership-level person on the Marketing Team or Sales Team (CMO or Director, Sales). If it were a product manager instead, that person would more likely be focused on customer-facing features. A stakeholder who is an engineering manager is more likely to be concerned with the maintainability of a product, and to aim to minimize the extent to which changes will create unanticipated work for his or her team. An executive stakeholder (like the one we are working with) is more likely to be focused on the “bottom line”: he or she won’t care too much about the product’s maintainability or about specific features as long as he or she can be assured that revenue will increase, or a client’s business will be retained, or expenses can be cut.

When faced with a scenario like the one given above, the first instinct of most data scientists is to consider different methods they could use to achieve the desired result. That is almost always the wrong reaction to this kind of situation. Here are some of the things to consider:

1. The stakeholders have presented a very vague problem statement.

Improving revenue and cutting costs is too wide a problem as presented. There is a need to narrow it down.

2. The stakeholders are the ones who will be consuming the insights you present and drive the required change.

Understand the constraints (if any) that you need to operate under. Align with their agenda and make your objective as close as possible to theirs.

Most of the time, however, people who ask for data science help do not know how to ask questions in a way that is data-science ready. While most data scientists are used to thinking about analysis in terms of methods, features and variables, and data transformations, most stakeholders are used to thinking about analysis in terms of spreadsheets or other tools they are familiar with. When they confront a business problem, they think “how would I solve this if I had to do it myself?” They are asking the best question they know how to ask, but that question needs to be translated and shaped into something a data scientist can act on. For YumEats!, the CxOs are asking the right questions:

How to increase revenue for the company?

How to cut spending and reduce the losses and move towards profitability?

As a data scientist you must translate these questions into actual data science problems that need to be solved. When asked to use “data science” to solve a problem, your first task is to think of ways you can ensure that you understand the problem. You need to meet their problem on their terms, not yours.

Always remember that the right direction is more important than speed. Huge progress in a completely wrong direction makes no sense.

Spend enough time to understand the problem and validate if it is the right direction to pursue.

Frame the business problem

For the broad questions that were mentioned previously, there could be a lot of sub-problems that could be solved. Talking to the stakeholders and brainstorming with them has led to these possible problems that can be solved.

Increase Revenue

Conduct campaigns to acquire new users, understand user profiles to deliver customized campaigns.

Predict when the users are likely to order next from the different trends and nudge them to order.

Recommend restaurants to the users to order from next, based on the previous orders of the users.

Perform existing customer segmentation and run a tiered campaign: budget restaurant recommendations for low-paying users, premium restaurants for high-paying users.

Identify good restaurants that are not yet onboarded on the app to increase the roster.

Provide recommendations to newly onboarded restaurants on the cuisine preferences of the people in the locality and give on-the-fly menu suggestions based on trends.

Automatically check the image quality of restaurant and food photos and enhance the low-quality photos to drive traffic.

Reduce costs

Understand which customers are likely to leave the platform and only provide discounts to retain them (that is, only provide discounts if someone is at risk of leaving the platform).

Automatically identify orders from nearby localities and assign them to the same delivery person.

Automatically identify fraudulent orders beforehand, and trigger orders only upon verification.
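As an illustration of the first cost idea, here is a minimal Python sketch of “discounts only for users at risk of leaving”. It is a stand-in, not a churn model: it assumes that a user counts as at risk when the last order is older than a fixed number of days. The user ids, dates, the `AT_RISK_AFTER_DAYS` threshold and the `users_to_discount` helper are all hypothetical.

```python
from datetime import date

# Hypothetical data: user id -> date of last order. All values are invented.
last_order = {
    "u1": date(2020, 1, 17),
    "u2": date(2019, 12, 1),
    "u3": date(2020, 1, 2),
}

AT_RISK_AFTER_DAYS = 14  # assumed threshold; a trained churn model would replace this rule


def users_to_discount(last_order_by_user, today, at_risk_after_days=AT_RISK_AFTER_DAYS):
    """Return the sorted ids of users whose last order is older than the threshold."""
    return sorted(
        user
        for user, last in last_order_by_user.items()
        if (today - last).days > at_risk_after_days
    )


targets = users_to_discount(last_order, today=date(2020, 1, 19))
print("Send retention discounts to:", targets)
```

Even a crude rule like this changes the conversation with the stakeholder: the decision becomes “who gets a discount and when”, and the threshold becomes something you can test and tune with data.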

These problems have further narrowed the scope of the different kinds of problems that can be solved here. Here is one thing to avoid while talking to a stakeholder. As a data scientist, you use a specialized vocabulary to describe your work and the results of that work. Some people, such as engineers, will understand some of your vocabulary. But most people (especially the stakeholders) will understand practically none of it. If you start talking about gradient-boosted trees or k-means clustering, you will at best get blank stares, and sometimes you might even encounter hostility.

Similarly, people who work on marketing, product management, or an executive team will speak about their work in terms that are not quite clear to you. To understand business stakeholders on their own terms, learn to ask good follow-up questions. A good follow-up question encourages stakeholders to illustrate what they mean without realizing that that is what they are doing. In the case study, some good follow-up questions that could lead to the above sub-problems could be:

“You’ve talked about increasing revenue. What were some of the initiatives for increasing revenue? What are the major pain points that you experienced that led to loss of revenue?”

“In your rich experience of market research, what is it that the user is looking for? How can we improve the user experience on the app?”

“You might have seen users that have left the platform. What are the main reasons for leaving the app?”

Notice that none of the previous questions had a “straightforward” answer. Asking these questions helps stakeholders clarify their thinking, gives you additional concrete examples of what the problem looks like, and attempts to elicit information that might help scope the problem.

Here are some guidelines for asking clarifying questions that can get your stakeholder to give you more useful details:

Get concrete as fast as you can. If a stakeholder talks about “what users want”, ask them to tell you about one specific user or a segment of users. People are natural storytellers: if you let them explain a problem in story form, you will get less distorted (though not necessarily less biased) information. For example, you might ask, “Can you walk me through the behaviours of some of the high frequency users of the YumEats app?”

Focus on pain points. Find out what users are trying to do with the product. Find out the pain points and whether the problem you are thinking of solving can alleviate them. Prioritize the pain points, and in turn prioritize the data science problems that might ease them.

Look for opposites. If we received inputs from stakeholders on users who were unhappy with the app, try to gather information about the other group too: users who were happy with the app. This will help build a balanced perspective. Look for behaviors that must be rewarded and the ones that need to be changed (identify biases).

Find hidden problems. The problems someone asks you to solve may not always be the most pressing ones. Look for problems that stakeholders mention incidentally as they tell you about what they think is their main problem. For example, suppose the stakeholder pointed out that users spend quite a bit of time on the platform without ordering. It may be worth asking: “Can we improve the ranking of recommendations to align with the user’s interests and push them to order soon?”

Asking clarifying questions serves several purposes:

1. It demonstrates that what is important to your stakeholders is important to you, too. It builds trust and establishes rapport, which are two things you will need when it comes time to share the results of your work.

2. It fleshes out your understanding of the problem. It is easy to assume that you understand what people want. It is much better to take a little extra time to reduce the possibility that you misunderstand.

3. It forces stakeholders to confront some of the complexities of the problem they are asking you to solve. It is easy for a stakeholder to assume that a problem is easy to address because it is easy for them to talk about. By asking clarifying questions, you force them to consider contradictions and nuances in their story.

Exercise — Refining the problem

From the previous exercise, you would have identified the stakeholder for your problem. As next steps, think of customer retention as a large problem and then break it down into different subproblems. Think of different ways to retain customers. Take cues from any of such apps you have been using or talk to friends who regularly use such apps. List down the different ways or strategies to retain customers just like we have done above. After listing down the different subproblems, next list down the different clarifying questions you would ask the stakeholders. Ask your friend to assume the role of a stakeholder and pose your clarifying questions and jot down the answers.

Plan for decisions, not findings

As data scientists, we often think about the results we produce in terms of findings: we conduct an analysis, validate certain results, and those results say something useful about how the business is doing or what the business should do next. For non-technical stakeholders, however, findings are almost always irrelevant. Stakeholders need to make decisions, and you should never assume that you fully understand what those decisions are, and you should definitely never assume that stakeholders will be able to naturally map your findings to their decisions. Consider the following questions related to the case study:

“Which users must be given the discounts to stay back on the app and when to trigger them?”

“To a new user who has just landed on the app, what is the right campaign to show?”

“For a new restaurant onboarded on the app, which dishes must be highlighted on the menu?”

All of the above questions are designed to elicit information about decisions. A stakeholder needs to make a decision. Any findings you produce should help them make those decisions.

Notice that these questions point to the basic “who”, “what”, “where”, “when” and “why” of how the app uses the data driven insights. Asking these questions helps you create a map of decisions and outcomes that will need to be considered when implementing the solution you eventually develop.

Here are some guidelines for mapping out the relevant decisions, which your stakeholder has alluded to more than defined, in a more formal and explicit way:

Understand timing. People have to make decisions at certain times, within certain timeframes, and on certain schedules. For example, you could ask, “When should a particular campaign be shown to a user to ensure maximum conversion?”

Understand expectations. Set clear expectations upfront. Clarify the timelines for the data science solutions to start showing results.

Understand downstream effects. Even though one stakeholder might ask you for a solution, they might not be the only person impacted by what you deliver. For example, discounts might make users happy, but the restaurant partner might be worried about the impact on offline business. The respective B2B sales director must be aware of the impact.

Understand when the business problem isn’t a data science problem. The most important thing to realize is that not all business problems can or should be addressed through data science. Make sure you feel confident that the problem is solvable in principle, and that using data science to solve it is the most cost-effective way to go.

Most importantly, analyze which of the possible data science problems hits the sweet spot that would lead to:

Quantifiable impact for users: increase in daily active users, increase in the daily orders

Quantifiable impact for stakeholders: increase in revenue with lower costs
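One simple way to look for that sweet spot is to score each candidate problem together with the stakeholders. The Python sketch below is only an illustration of the idea. The 1 to 5 scores are placeholders I made up, not findings, and the `Candidate` class and its scoring formula (impact for users plus impact for stakeholders, divided by effort) are assumptions you should replace with whatever your stakeholders actually value.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    user_impact: int         # 1 (low) to 5 (high), placeholder score
    stakeholder_impact: int  # 1 (low) to 5 (high), placeholder score
    effort: int              # 1 (cheap) to 5 (costly), placeholder score

    @property
    def score(self) -> float:
        # Assumed formula: total impact per unit of effort.
        return (self.user_impact + self.stakeholder_impact) / self.effort


# Hypothetical scores, to be filled in during stakeholder discussions.
candidates = [
    Candidate("Churn-targeted discounts", 3, 4, 3),
    Candidate("Photo quality enhancement", 2, 2, 4),
    Candidate("Onboard good restaurants", 4, 4, 2),
]

ranked = sorted(candidates, key=lambda c: c.score, reverse=True)
for c in ranked:
    print(f"{c.score:.2f}  {c.name}")
```

The numbers matter less than the exercise: putting impact and effort side by side forces everyone to say out loud why one problem deserves to be solved before another.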

By now it must have become clear that increasing revenue while cutting costs is a tricky proposition, for the two goals seem to contradict each other. To increase revenue, you need more DAUs, which would be acquired through various campaigns that would incur further cost. After a lot of deliberation with the stakeholders, you now arrive at the actual business problem that you would like to solve: identify the good restaurants to be targeted for onboarding onto the app.

With this problem, you are trying to improve the choice of good restaurants available to a user and improve the quality of restaurant partners on the app. In the next chapter, we will talk about scoping the solution to this problem.

Exercise for YOU— Identify ONE problem to solve.

From the different subproblems that you have listed, shortlist the problems where you feel data science can make an impact or solve the problem. (Go back to the workflow for applying Machine Learning covered in Introduction to Data Science…) Next, think about what decisions the stakeholders can take on solving the shortlisted subproblems. (Remember that we are focused more on decisions and less on findings.) Further, analyze what could be the impact of solving each of the shortlisted subproblems. Discuss and arrive at ONE problem statement that you would like to tackle.

More on problem-solving for YumEats in the next post! https://medium.com/@shwedoshi/business-problem-solving-with-data-science-scope-your-solution-post-2-of-3-693ab4b568ba

Shweta Doshi

I am an unapologetic idealist who believes that to gain quality education, we need to transform the way we teach & learn. I am the Co-Founder at www.greyatom.com