Business Managers are from Mars, Data Scientists are from Venus
Written by Deepa Naik
This is more evident than ever as AI projects move out of the labs and into real-world applications. Organizations are building teams of people with domain, analytics, and technology expertise, each operating in their own silo. For projects to succeed and cross the AI chasm, we need skills that overlap the business analysis and data analysis spheres. The two worlds need to merge.
The Need for Business Managers with Analytic Skills
For the handshake between business and AI to be meaningful, each side needs to walk halfway. So the question remains: how much business know-how should an AI expert have, and how much AI skill should the domain consultant acquire? For the business manager, getting into the tools of deep learning or machine learning does not really make sense. It would be like the 1990s, when people were taking up “Learn VB Script in 20 Days.” More than the tool, it is the fundamentals that matter. And the fundamentals lie in the body of knowledge of statistics and AI, which existed long before big data came into the picture. Concepts and algorithms can be understood much better when explored with smaller data sets and simple tools like spreadsheets. You can then bring in a machine learning, deep learning, or big data expert to use the sophisticated tools. Having learned the fundamentals, the business manager can communicate with that expert easily and scale the analysis to the big data world.
In the age of data explosion, everyone wants to make decisions based on analytics and statistics. There are a lot of tools that will provide you with good recommendations; however, as a marketing or sales professional you want to rely on your judgment and not solely on some prediction made by “intelligent” software. It is easier to trust your judgment when you use the AI tool to augment your decision. And for that, you need to know how the tool made its predictions in the first place.
We look at a simple example, ‘Promotions’, which is a very common and practical use case for marketing managers. How does one exploit the existing data in the organization and use analytics to get a better business outcome from a promotional event?
Understanding Bayesian — An Example
We get into the nitty-gritty of the Bayesian Network algorithm, understand it, and try to apply it to this scenario. At the heart of Bayesian Networks is the theory of probability. Take the classroom example of the toss of a fair coin: the probability that you get a head is 1/2 or 0.5, as there are two equally likely outcomes and the probabilities of all outcomes always add up to 1. It will be either heads or tails. Similarly, the probability of a ‘sale’ versus ‘no sale’ of a product (maybe you want to track the one with the highest margin), say Laptop Model XYZ, might be 0.4 (a 40% chance of a sale). How do you arrive at this number? You look at the past data for this product and calculate the fraction of sales opportunities that ended in a sale. This is a simple probability.
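The simple probability described above can be sketched in a few lines. This is a minimal illustration, not a prescribed method: the list `past_outcomes` is made-up data chosen so that the result matches the 0.4 figure, and the function name `simple_probability` is hypothetical.

```python
# Hypothetical past records for Laptop Model XYZ: 1 = sale, 0 = no sale.
# The values are invented so that 4 out of 10 opportunities ended in a sale.
past_outcomes = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]


def simple_probability(outcomes):
    """Return the fraction of past sales opportunities that ended in a sale."""
    if not outcomes:
        raise ValueError("need at least one past record")
    return sum(outcomes) / len(outcomes)


p_sale = simple_probability(past_outcomes)
p_no_sale = 1 - p_sale  # the two outcomes together must add up to 1

print(f"P(sale) = {p_sale:.2f}, P(no sale) = {p_no_sale:.2f}")
```

The same count-and-divide calculation works in a spreadsheet, which is in keeping with the idea of exploring concepts on small data sets first.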
However, you then start refining this number by applying conditions. What are the factors that decide a sale: gender, age, income? Let’s take the first condition. How likely is it that a male customer buys the product? Suppose the past data shows that 70% of your laptop buyers have been men (0.7 probability). Note that this is the share of buyers who are men, which is not yet the chance that a given male customer buys.
This gets us to the concept at the foundation of Bayesian theory: conditional probability. What is the probability that a laptop is sold ‘given’ that the customer is male? This will give you a number, and as a business manager you don’t really have to worry about the mathematical calculation behind it (you can if you want to!). A tool will provide it, or you can bring in a data analyst to work it out.
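For readers who do want to see the calculation, here is a hedged sketch using Bayes' theorem. It treats the 70% figure as the share of buyers who are men, P(male | sale), and the 0.4 figure as P(sale). The share of men among all customers, `p_male`, does not appear in the article and is an assumption made purely for illustration, as is the helper name `bayes_posterior`.

```python
def bayes_posterior(p_evidence_given_event, p_event, p_evidence):
    """Bayes' theorem: P(event | evidence) = P(evidence | event) * P(event) / P(evidence)."""
    if not 0 < p_evidence <= 1:
        raise ValueError("p_evidence must be in (0, 1]")
    return p_evidence_given_event * p_event / p_evidence


p_sale = 0.4             # from the article: overall probability of a sale
p_male_given_sale = 0.7  # from the article: 70% of past buyers were men
p_male = 0.5             # ASSUMPTION: half of all customers are men

p_sale_given_male = bayes_posterior(p_male_given_sale, p_sale, p_male)
print(f"P(sale | male) = {p_sale_given_male:.2f}")
```

Under these assumed inputs the conditional probability comes out higher than the unconditional 0.4, which is the kind of signal that suggests a condition is worth keeping. A different value for `p_male` would change the result, so that input needs to come from real customer data.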
Similarly, this can be extended to age groups. What is the probability that a laptop is sold given that the customer is male and in the age group between 20 and 35?
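In probability notation, this can be written as P(Sale | Male, Age 20 to 35), read as the probability of a sale given that the customer is male and aged between 20 and 35.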
So finally, by applying a set of conditions, you can refine the numbers. The conditions that give you the highest probability (based on past data) are where you will focus your promotional campaign. So, based on the data, you might decide that the campaign should target men in the age group 20 to 35, with at least a bachelor's degree, living in a certain suburb.
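One way to picture this refinement is to estimate the sale rate for each combination of conditions and pick the highest. The sketch below assumes record-level past data is available; the `records` list, the age bands, and the function name `sale_rate_by_segment` are all invented for illustration and use only two conditions to stay short.

```python
from collections import defaultdict

# Hypothetical past records: (gender, age_band, sold), where sold is 1 or 0.
records = [
    ("male", "20-35", 1), ("male", "20-35", 1), ("male", "20-35", 0),
    ("male", "36-50", 1), ("male", "36-50", 0), ("male", "36-50", 0),
    ("female", "20-35", 1), ("female", "20-35", 0), ("female", "20-35", 0),
    ("female", "36-50", 0), ("female", "36-50", 0), ("female", "36-50", 1),
]


def sale_rate_by_segment(rows):
    """Estimate P(sale | gender, age_band) as sales divided by customers in each segment."""
    counts = defaultdict(lambda: [0, 0])  # segment -> [sales, customers]
    for gender, age_band, sold in rows:
        counts[(gender, age_band)][0] += sold
        counts[(gender, age_band)][1] += 1
    return {segment: sales / total for segment, (sales, total) in counts.items()}


rates = sale_rate_by_segment(records)
best_segment = max(rates, key=rates.get)  # segment with the highest estimated rate

for segment, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(segment, f"{rate:.2f}")
print("Focus the campaign on:", best_segment)
```

With only three customers per segment, as in this toy data, the estimates are far too noisy to act on; real segments need enough records behind them before the ranking means anything.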
How realistic this number is depends on how well you define your conditions (domain knowledge), how reliable the past data is, and how much past data you are using.
Applying Bayesian in the Real World
This, of course, is only scratching the surface. Bayesian Networks can get much more complicated than that. You can have the following scenarios:
- Cyclic conditions
- Probabilities for continuous variables (sale was a discrete variable with value 0 or 1)
- Prior, joint and conditional probabilities
- Multi-class Classification
- Gaussian Bayesian Networks
Here is an interesting article if you want to get into the details of the Bayesian Network algorithm.
The roadmap, however, is this: once you have built your analysis model on sample data, you can use various tools and involve a data analyst or data scientist to come up with more complex models. You will need to provide reliable past data so that the data scientist can train the model (in this case, estimate the various probabilities).
There are a number of tools in the market that work in this area, a popular one being BayesiaLab. A visual tour of the Bayesian Lab is an interesting video for starters. There are statistical tools such as R and SPSS. Frameworks and tools can also be developed using programming languages such as Python and Java. Here is a list of 10 free and open source Bayesian network software packages.
The Challenges and the Future
The challenges remain, though, the biggest one being the reliability and quality of the past data. Is your organization capturing data correctly and accurately enough to base your decisions on? The second one is whether the right pieces of data are being captured. You may know from your business know-how that market conditions are one of the most important factors, yet have no past data captured for them.
We looked at the Bayesian algorithm for promotions. Similarly, there are different algorithms for different business scenarios. Linear regression is used in financial portfolio prediction and in traffic applications for arriving at ETAs. Support vector machines have found several applications in the oil and gas industry, in image classification, and in text and hypertext categorization. Multivariate analysis is used in the retail sector, where the customer makes a choice of brand, price, product, etc. One thing is clear, however: an understanding of algorithms is going to be an important skill for business managers, and an area they need to venture into sooner rather than later to make the most of the plethora of AI tools flooding the market.
- Which skills are most valuable in machine learning?
- 10 free and open source Bayesian network software
- Bayesian in R — an interactive visualization
- Bayesian Learning for statistical classification
- Marketing Mix models with Bayesian Networks
- A Tutorial on Learning with Bayesian Networks
- Taking Bayesian Networks Into the Business
About the Author:
Deepa is a founding member of Humans For AI, a non-profit focused on building a more diverse workforce for the future leveraging AI technologies. Learn more about us and join us as we embark on this journey to make a difference!