The power of Python in Power BI

Yang
CodeX
Published in
5 min readSep 11, 2021

Running Python scripts in Power BI has been an exciting feature for many data analysts and business intelligence professionals, and it’s easy to see why. Power BI is a powerful Microsoft data visualisation and analytics tool. Its user-friendly interface and drag-and-drop functionalities have enabled many non-technical people to create informative and presentable dashboards. As a popular programming language in the data science and machine learning world, Python is well placed to complement Power BI’s capabilities. Python has all kinds of libraries that can help you from extracting and cleaning the data, making customised visualisations, analysing data in different dimensions, predicting and forecasting data, etc. Therefore, combining these two can be an excellent strategy.

Source: https://unsplash.com/photos/Q1p7bh3SHj8?utm_source=unsplash&utm_medium=referral&utm_content=creditShareLink

As a Power BI specialist with a basic background in Python, I decided to learn more about how to use Python in a data science project and try to apply this knowledge in Power BI. I followed an online course called Python A-Z™: Python For Data Science With Real Exercises! on Udemy, which I highly recommend to people who want to learn how to use Python in Data Science. This course introduces you to Python’s basics, such as installing and coding using libraries for preparing and visualising data.

After I finished this course, I thought I was ready to start my first Python and Power BI project. However, I was stuck at running Python scripts in Power BI. To achieve this, we will need to create a virtual environment. In addition, I’ve installed the Matplotlib and Seaborn libraries to start with. I’ve played with these two libraries in the Udemy online course and found them easy to use. I also installed Plotly, which includes functions for creating map controls.

Then we need to set the Python home directory in the Power BI desktop. Otherwise, you will get an unreadable error when you try to run anything in Power BI.

Set Python home directory; screenshot from my local machine

Once the environment is ready, we can dive in. The lifecycle of data science or data visualisation projects is similar no matter what technologies we want to use. The first step is to identify what we want to analyse. Next, we need to prepare the data. This includes getting and transforming the data. Then, we can analyse and visualise data.

In my first project, I planned to do some analysis on all summer and winter Olympic medals’ attributes, such as weight, diameter, etc. After googling, there isn’t a CSV or excel version of the dataset readily available online. The only dataset I could find was a table on a Wikipedia page.

https://en.wikipedia.org/wiki/Olympic_medal

Powerful data connectors are one of the great advantages of Power BI. We can easily use one of them to connect data from SQL database, SharePoint, Dynamics 365, and so on. But as this article is about the power of Python. I decided to give it a go.

Get data in Power BI ;screenshot from my local machine

As you may already know, installing different libraries in Python can save considerable time compared to writing code from scratch. Luckily, I found that the Wikipedia and Request packages are the ones I need. There is a useful article that tells you how to do that, Web Scraping Wikipedia Tables into Python Dataframe | Analytics Vidhya (medium.com)

By following the instructions in this article, I managed to extract the Olympic medals table from the Wikipedia page and load it into Power BI.

Olympic Medal table in Power BI; screenshot from my local machine

Normally for a data science project, using Python to prepare the data, such as changing data type or aggregating columns is the next step. But for this purpose, I prefer to use Power Query and DAX because I’m more experienced with those two than Python.

The next step is to analyse and visualise data. I click the “Py” button from the “Visualizations” area, choose the columns from the table we just loaded into Power BI, and then write Python scripts in the area below.

Python distribution chart in Power BI; screenshot from my local machine

I created the above distribution chart, which analyses the relationship between each medal’s diameter and thickness. Thanks to the Matplotlib and Seaborn libraries, only one line of code is needed to specify the columns you want to report on to create this chart. This demonstrates how simple Python is at making charts involving distributions of some sort, which can allow you to create more advanced Power BI reports.

However, if you only need simple line or bar charts, I suggest using Power BI’s built-in charts instead. Such charts are easy to configure and change — all without requiring code. Furthermore, loading Python visuals takes more time than the built-in charts, so it makes more sense to allocate the longer loading time to more complicated charts that Power BI can’t achieve by itself.

A comparison of line charts: Python (left), Power BI built-in chart (right)

Maps are a desirable visualisation in reports involving geographic data. Creating a map visualisation just needs a few clicks if you are using Power BI’s built-in chart.

A Power BI built-in map visualisation; screenshot from my local machine

One limitation of using Python in Power BI that I discovered is that not all Python libraries are integrated with the Power BI frame. For example, I made the following animated map visualisation using Python’s Plotly library. It’s slightly more complicated in terms of defining attributes such as location and size in code. But the visual effect looks fancy because this map is animated. We can’t display such an animated map in a Power BI report yet due to the integration issue.

A Python animated map visualisation; screenshot from my local machine

In this article, we’ve seen what Python can do in Power BI using some simple examples. Python unlocks great potential in terms of getting data and creating customised charts that are more complicated. However, there are some limitations of Python visuals. As we mentioned before, Python can slow down performance, and it can only retrieve up to 150,000 records. I can imagine this being problematic for big business intelligence projects.

This is just the beginning for me to explore what Python can do in Power BI. I may have more ideas to share in the future!

--

--

Yang
CodeX
Writer for

Living in London | Technology | Data | Museum | Food