Supercharge your Business Intelligence with Automated, Explainable ML
TL;DR: If you’re a business analyst, you’re probably constantly faced with questions about likely future business outcomes, and what factors affect past and predicted outcomes. Explainable ML is the most reliable way to answer these questions, yet most BI tools lack an easy way to fit ML models and generate attributions. Working with Data Science teams is the default solution today, but this is slow and expensive, and it can take days or weeks to produce trustable answers. XaiPient’s XBI (Explainable BI) platform is an augmented BI tool that provides a simple, familiar BI experience where you can leverage automated, explainable ML via simple SQL queries, and get your answers in minutes to an hour.
Everyday questions for Business Analysts
Analysts in marketing operations, business operations, customer analytics or product analytics departments are constantly asked to answer questions like:
- What combinations of customer attributes have a strong influence on rates of conversion (i.e. purchases, subscriptions, or some other desirable outcome)?
- Which product features drive the most engagement?
- Which types of customers are most likely to stay loyal, and which are likely to churn?
Simple counts/stats don’t work — they need ML
The analysts’ first impulse might be to look at their historical business data and compute simple counts and statistics of combinations of attributes (say job and education), but for a variety of reasons, the answers will be inaccurate and misleading.
A much better approach is to fit an ML model to the historical data. ML algorithms are far more efficient at capturing hidden patterns in large datasets involving tens, hundreds or thousands of variables.
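One way simple counts can mislead is confounding between attributes. The sketch below uses made-up numbers (not taken from any real dataset) in which one job group looks better on raw conversion rate, yet the other job group converts better within every education level. The job and education labels and all counts are illustrative assumptions.

```python
from collections import defaultdict

# Made-up counts: (job, education, customers, conversions).
data = [
    ("job_A", "university", 80, 40),
    ("job_A", "basic", 20, 2),
    ("job_B", "university", 20, 12),
    ("job_B", "basic", 80, 12),
]

def rate_by(keys):
    """Conversion rate for each combination of the given attribute names."""
    totals = defaultdict(lambda: [0, 0])  # key -> [customers, conversions]
    for job, education, n, conv in data:
        record = {"job": job, "education": education}
        key = tuple(record[k] for k in keys)
        totals[key][0] += n
        totals[key][1] += conv
    return {k: conv / n for k, (n, conv) in totals.items()}

by_job = rate_by(["job"])
by_job_and_education = rate_by(["job", "education"])

# Raw rates favor job_A, but job_B is ahead inside each education level.
print(by_job)
print(by_job_and_education)
```

Here job_A's high raw rate comes mostly from its customers being concentrated in the high-converting education level, which a count by job alone cannot reveal.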
Analysts struggle (and wait) to leverage ML
OK, so the analyst needs ML. Now what? A typical business analyst would know how to use a BI tool, and some SQL, but certainly wouldn’t want to write Python code for an ML model. They’d search around for a Data Scientist at their company. If they’re lucky, they’d find a Data Scientist who has the time to handle this type of ad-hoc request. The Data Scientist would then connect to the Data Warehouse where this data sits, download the data, understand it, and explore a few different types of models (in their Jupyter notebooks) to predict the desired outcome (e.g., a conversion). In a week or so they might settle on a model, implement it, debug it, and then deploy it for the analyst to use.
Model explanations are needed for trust and actionability
It’s not enough to just build an ML model. To answer the types of questions we mentioned above (e.g., what factors drive conversions?), analysts need model explainability as well, for at least two reasons: (a) trust: understanding what factors contributed to a prediction helps the analyst validate and trust it, and (b) actionability: many of the model inputs are adjustable business knobs or levers, so for example if it turns out that on average a certain campaign has no influence on predicted conversion rates, then that campaign can be turned off, thus boosting ROI.
For simple tabular datasets, a data scientist might be able to leverage one of the many open-source Python packages to compute one type of model explanation, namely feature attributions (i.e., how much each model input contributes to the prediction, on specific cases, or in aggregate on some data-slice). But for complex event-sequence data common in business settings, open-source explainability solutions are often not suitable, and a more custom approach needs to be developed, which costs additional days of data scientists’ time.
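To make "feature attribution" concrete, here is a minimal sketch for the simplest possible case, a linear scoring model, where each feature's contribution relative to a baseline is its weight times its deviation from the baseline value. This is an illustration only, not XBI's method or any particular package's API; the feature names, weights, and baseline values are all made up.

```python
# Hypothetical linear scoring model: score = bias + sum(weight * value).
weights = {"age": 0.02, "num_contacts": -0.10, "prior_success": 1.50}
bias = -1.0

# Baseline: average feature values over some reference data (made up here).
baseline = {"age": 40.0, "num_contacts": 3.0, "prior_success": 0.1}

def score(x):
    """Model output for one customer (a dict of feature values)."""
    return bias + sum(weights[f] * x[f] for f in weights)

def attributions(x):
    """Contribution of each feature to score(x) - score(baseline)."""
    return {f: weights[f] * (x[f] - baseline[f]) for f in weights}

customer = {"age": 30.0, "num_contacts": 1.0, "prior_success": 1.0}
attr = attributions(customer)

# For a linear model the attributions add up exactly to the score difference.
print(attr)
print(sum(attr.values()), score(customer) - score(baseline))
```

For non-linear models the per-feature contributions are no longer this simple to read off, which is why dedicated attribution techniques exist.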
Working with a Data Science team can take days to weeks.
Let’s look at how expensive this approach to ML and explainability is:
- Data Scientists are best suited to work on the “core”, critical Machine Learning problems of a business. One week of their time could cost anywhere between $100K and $200K on an annualized basis, and this does not even take into account the cost of pulling them away from their core function.
- A week, or even a few days, is a long time for an analyst to wait before answering the numerous ad-hoc questions they face. For example, a quicker answer might have helped them improve their targeting strategy rapidly, so the potential opportunity cost could well be multiples of $100K annually.
XaiPient’s XBI (Explainable BI) platform: Predictions and attributions in minutes to an hour
At XaiPient we passionately believe that today there’s no reason why non-technical analysts should be struggling to get trustable, actionable ML-based insights from their business data. Our vision is to make automated, trustworthy ML accessible to these analysts.
Our first product embodying this vision is XBI (Explainable BI), an augmented BI platform that combines all the usual BI functionalities, with the ability to automatically fit ML models to historical data, and get predictions and feature attributions within minutes to an hour, instead of days or weeks. We chose to build XBI on top of the popular Redash open-source framework so analysts can connect to any of the popular data warehouses or SQL Databases, and get a simple, familiar BI experience where they can analyze their data, train ML models and get predictions and attributions with simple SQL queries. For model-training and feature attributions, we’ve defined special SQL commands that leverage our API under the hood. The app currently handles (non-sequential) tabular datasets, and will soon support event-sequence data using models and attribution techniques based on our recent ICML paper. As we add more capabilities to our API, they will be made available in this app.
Example: What factors influence conversions?
Let’s see XBI in action using the popular Bank Marketing Dataset from UCI. This is a simple tabular dataset representing customer responses (subscribed or not) to a service that was being marketed by a certain Portuguese bank. Each row corresponds to a customer and consists of attributes of the customer and campaigns, and the binary column converted indicates whether the customer subscribed or not. Suppose the analyst wants to understand which combinations of job and education had the most impact on conversions. As we mentioned above, trying to answer this with simple stats or counts (using existing BI tools) is likely to be misleading and inaccurate, and an ML-model-based approach would be far superior. Instead of spending days or weeks waiting for a Data Science team to build an ML model, the analyst can simply run this query:
SELECT attributions
FROM `bank_data`
WHERE TARGET = converted
When this query is run, several things happen under the hood, within minutes to an hour (depending on the size of the dataset): (a) an ML model is automatically fitted to the dataset to predict the converted column, and (b) the original table is augmented with a new prediction column, and for each feature column F, there is a new attribution column F_a: for example the column job_a contains, for each customer (i.e., row), the attribution (i.e., contribution) of the job attribute to the model prediction for that customer.
It turns out this type of augmented table is very convenient. For example, to find out which combinations of job and education impact conversions the most among those under 40, the analyst can aggregate the attributions job_a and education_a over data slices grouped by job and education, after filtering by the age_under40 condition.
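The aggregation described above can be sketched with ordinary SQL over an augmented table. The example below uses Python's built-in sqlite3 module as a stand-in for a real warehouse connection. The table name bank_data_augmented, the treatment of age_under40 as a 0/1 column, the choice of AVG as the aggregate, and every value in the rows are assumptions made for illustration; the query XBI users actually run may differ.

```python
import sqlite3

# Hypothetical rows of an augmented table: feature values plus the
# per-row attribution columns job_a and education_a (values are made up).
rows = [
    # job, education, age_under40, job_a, education_a
    ("admin", "university", 1, 0.10, 0.05),
    ("admin", "university", 1, 0.20, 0.15),
    ("technician", "high_school", 1, -0.05, -0.10),
    ("technician", "high_school", 0, 0.30, 0.30),  # excluded by the age filter
    ("retired", "basic", 0, 0.40, -0.20),          # excluded by the age filter
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bank_data_augmented ("
    "job TEXT, education TEXT, age_under40 INTEGER, job_a REAL, education_a REAL)"
)
conn.executemany("INSERT INTO bank_data_augmented VALUES (?, ?, ?, ?, ?)", rows)

# Average the attributions per (job, education) slice among customers under 40.
query = """
SELECT job, education,
       AVG(job_a) AS avg_job_a,
       AVG(education_a) AS avg_education_a,
       COUNT(*) AS n
FROM bank_data_augmented
WHERE age_under40 = 1
GROUP BY job, education
ORDER BY AVG(job_a) + AVG(education_a) DESC
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Including the row count n alongside the averages helps the analyst spot slices that are too small to draw conclusions from.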
A similar query gives global importances of the variables:
SELECT global_importances FROM `bank_data`
WHERE TARGET = converted
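The post does not say how XBI computes global importances. One common convention, shown here purely as an assumption, is to average the absolute value of each feature's per-row attribution, so that positive and negative contributions do not cancel out. The attribution values below are made up.

```python
# Hypothetical per-row attributions (made-up values), one dict per customer.
rows = [
    {"job_a": 0.10, "education_a": -0.05, "age_a": 0.30},
    {"job_a": -0.20, "education_a": 0.05, "age_a": 0.10},
    {"job_a": 0.30, "education_a": 0.02, "age_a": -0.35},
]

def global_importances(rows):
    """Mean absolute attribution per attribution column."""
    features = rows[0].keys()
    n = len(rows)
    return {f: sum(abs(r[f]) for r in rows) / n for f in features}

importances = global_importances(rows)

# Rank attribution columns from most to least important.
ranked = sorted(importances, key=importances.get, reverse=True)
print(ranked)
```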
The analyst can then quickly create a dashboard based on queries such as the ones above.
Using the XBI platform can make a business analyst feel like they have super-powers. If you’d like to see what this feels like, we invite you to try it out for free at https://bi.XaiPient.com . This is an early beta, so we’d love to hear your feedback.