How to Use Power BI for Predictive Analytics: 3 Examples from Beginner to Advanced
While Power BI is primarily known as a powerful business intelligence and data visualization tool, some of its features can also be used for predictive analytics. These capabilities can enable data analysts and data scientists to experiment and answer important questions without significant investments in deployment. Here are several ways in which Power BI can be used for predictive analytics:
Method 1: Built-in Forecasting Features
Power BI has built-in forecasting capabilities that allow you to automatically visualize future trends based on historical data. By using line charts and adding forecast lines, you can easily predict future sales, revenue, or other key metrics. Adjusting the forecast length and confidence intervals helps tailor the predictions to your specific needs.
Below is an extract from a dummy sales dataset found on Kaggle.
Here, sales transactions are captured at the row level with several key fields, such as the location of the sale and its associated regional data.
By using a simple line chart to plot sales over time, we can then click on the Analytics icon in the Visualizations pane and select ‘Add’ under the Forecast section (shown below).
From here, we can configure the forecast length (for example, 1 year), along with other options such as confidence intervals and the option to disregard anomalous periods or adjust for seasonality, which gives us a forecast for overall sales (as in the image below).
As the location is also captured, we can easily replicate this predictive analysis across this filter criterion for a quick view of how this will affect the predicted forecast figures. Given the speed at which we can replicate visuals in Power BI, we can very easily create multiple visuals showing how each region is forecasted to perform in comparison to the overall trend, as shown below. In this example, it is also useful to see that although the confidence intervals are set equally, regions with more volatility will show a much wider forecast range than those that have shown consistent trends.
Method 2: Using DAX Measures for Custom Forecasting
While Power BI’s native language DAX is typically used for business intelligence tasks like calculating totals and aggregations, it can also be applied to simple predictive tasks. For example, you might use DAX to calculate the average time between sales to predict when a product is next likely to be sold. This, in turn, could create valuable sales quantity predictions over the short term to help dictate pipeline logistics.
Example: Pipeline Predictions
Using a similar dataset as an example, another interesting use case may be to look at the quantity of sales predicted over the next 30 days. This would be more difficult to do with Power BI’s ‘out of the box’ forecast features, so some custom analysis is required.
There are several DAX functions we can write to help us calculate the following:
- Time Between Transactions: Here, we can use basic functions such as MAX and DATEDIFF to create a column that will calculate the difference between any transaction and its previous one for the same product.
Days Between Sales Product =
VAR CurrentDate = Sales[Order_Date]
VAR PreviousDate =
CALCULATE(
MAX(Sales[Order_Date]),
FILTER(
Sales,
Sales[Product_ID] = EARLIER(Sales[Product_ID]) &&
Sales[Order_Date] < EARLIER(Sales[Order_Date])
)
)
RETURN
IF(
ISBLANK(PreviousDate),
BLANK(),
DATEDIFF(PreviousDate, CurrentDate, DAY)
)
- Average Time Between Transactions for Each Product: This measure builds on the last by using the AVERAGE function. Here, we can average out the days between transactions for each product. This could be written as a basic measure instead of a column.
Average Days Between Sales Product =
CALCULATE(
AVERAGE(Sales[Days Between Sales Product]),
ALLEXCEPT(Sales, Sales[Product_ID])
)
- Predicted Next Purchase Date: Finally, this last column will predict the next purchase date by finding the last transaction date/time and adding the average number of days between transactions to that date/time.
Next Sale Date Products =
VAR LastSaleDate =
CALCULATE(
MAX(Sales[Order_Date]),
FILTER(
Sales,
Sales[Product_ID] = EARLIER(Sales[Product_ID])
)
)
RETURN
IF(
ISBLANK(LastSaleDate),
BLANK(),
LastSaleDate + [Average Days Between Sales Product]
)
The effectiveness of this type of predictive analysis depends heavily on several factors such as the size of the dataset and the nature of the use case. The potential power of this approach is also increased if additional fields are captured in dimension tables for products and customers. Depending on the use case, the analysis can be easily tailored to focus on customer-based transactions by modifying the original calculated column to track gaps between customer transactions instead of product transactions, or any other variable. Moreover, with a comprehensive sales table containing a wide range of fields, this analysis can be applied across multiple filter criteria to observe how trends and patterns may vary. Although not quite synonymous, this approach is akin to regression analysis, where various variables are isolated to determine if correlations persist.
Method 3: Using R or Python Scripts in Power BI
For more advanced methodology, where statistical models are required, it is difficult to achieve this with DAX alone. Here, it may make more sense to carry out the modeling in Python or R before importing the script into Power BI. This can be done at the Power Query stage by selecting the ‘Run R’ or ‘Run Python’ option in the top banner.
In a scenario where the business is aiming to predict sales on a future day based on previous data, a suitable Python script may include the following steps:
- Importing the relevant libraries such as NumPy, Pandas, etc.
- Extracting all relevant fields from the dataset
- Preparing the data for modelling (i.e., one-hot encoding category values as numerical)
- Defining the independent (X) and dependent (y) variables. (In this case, Product Category and Date would be independent, and ‘Sales’ would be the dependent variable.)
- Splitting the data into training and test sets
- Creating and training the model, i.e., model = LinearRegression()
- Making the predictions, i.e., predictions = model.predict(X)
- Plotting the results
The script can then be used in conjunction with the Power BI Python visual to visualize the results.
Conclusion
Power BI’s versatility extends beyond traditional business intelligence and data visualization, making it a valuable tool for predictive analytics. By leveraging built-in forecasting features, custom DAX measures, and advanced scripting with Python or R, users can gain deeper insights and make data-driven decisions with greater confidence. As shown in the examples above, whether used by a beginner exploring basic forecasting or an advanced user implementing complex statistical models, Power BI provides the flexibility and power needed to address a wide range of predictive analytics scenarios. Embracing these capabilities can significantly enhance the analytical toolkit of data analysts, enabling businesses to uncover trends, predict future outcomes, and drive strategic business initiatives.
About the Author
Liam Williamson is a Data Visualisation Consultant here at Version 1.