Power Bi Dashboard using web scraping with Python

Evangelos Zafeiratos
5 min readJan 26, 2022

--

Combine multiple data sources to build a meaningful dashboard

Power Bi Dashboard displaying Greece macro economy KPIs
Interactive Power Bi dashboard

Information and data is consumed at an enormous pace in our modern society, whereas at the same time the human attention span has markedly decreased also due to the bombardment of stimulation from multiple sources.

Data Visualization is powerful with its intrinsic ability to offer condensed pieces of information in a manner which is friendly to the human brain, thus it is gaining more and more popularity. Moreover, economy and finance has traditionally been one of the domains where data visualization is heavily used and today this is no exception.

The facts above generated the idea for this project which is to combine several macro and price indices of Greece economy into a single dashboard. Logic behind the selection of the specific metrics is to provide a glimpse into Greece economy that would be meaningful for the average person (the dashboard is by no means suited for investors as it doesn’t deep-dive into the details) .
Purpose of this article is to describe the methodology and share some insights on the challenges I faced that might be useful to the ones working on similar projects.

Phase 1 : The Research

As already mentioned, metrics for the dashboard were selected having the average person in mind : what would someone living in Greece need to know about their country’s economy deriving from the indices? I concluded to the following 4 pillars :

  1. Stock Market Index
    In every modern country the stock market is a vital part of the economic ecosystem and is as well frequently used as an indicator of the country’s performance. I decided to use the Athens Stock Marcket (ΧΑΑ) index value but to skip all the individual indices.
  2. Basic Macro-economic Metrics
    A proper analysis of a country’s economic outlook would involve dozens, if not hundreds of KPIs, metrics and indices. I have included just a bunch of them, the ones I consider as the most impactful in people’s everyday lives:
    - GDP growth : the annual rate in which the economy grows
    - Unemployment : the % of economic active population who are jobless
    - inflation : the rate of rising prices of goods and services
  3. Bonds
    A common practice for raising money either by countries or corporations in today’s economy is by issuing bonds. The yield by which the country borrows money is a very reliable indicator of the country’s credibility to the money markets. For Greece with her recent economic turmoil, these indicators are even more important.
    I have added the long term bonds that the country has issued over the last years, both 10Year and 20Year bond.
  4. Energy Price
    Out of the multiple price indices, those concerning energy have the most significant impact in economic activity and purchasing power of the population consequently. I chose to include graphs showing the price variation of the most commonly used energy sources : diesel & unleaded representing fuel and electricity as the main household energy source.

Phase 2 : Finding Sources

Finding an API to retrieve all of the data mentioned above proved to be a non-option as all the relevant APIs were either insufficient or too costly. Next available option was to use web scraping to retrieve this data from existing websites. This option has the following risks:
a) Websites collecting data (especially the ones offering the option of paid API to get access to their data) could potentially enable a strict policy against scraping, which would break the gateway to our dashboard.
b) Web scraping techniques are based on the exact structure of the website at the given moment to access the necessary data, therefore any slight change in the HTML or CSS code (i.e changing the class name of a table) might break our code as well.
However, for a non-production project this is an acceptable risk. Plus, I looked for relatively stable websites as my data sources. The ones I selected :

Trading Economics : Used to retrieve all macro economic indices, bonds & stock market index.
Statista : Data source for monthly electricity price
fuelo.net : Data source for average fuel price (diesel & unleaded)

Phase 3 : Writing some Python Code

I used Python and the powerful BeautifulSoup library to parse the HTML from the source websites. Since Power Bi does only recognize Pandas DataFrame data structure as potential input, I had to make a conversion to DataFrame for each individual table.
I divided my code into 4 separate functions (one for each data retrieval process) and after ensuring it retrieves the data in the correct format I used this code as my data source in Power Bi (which I will describe in Phase 4).
You can see through the schematic below the steps beginning with our integrated Python script and ending in the final data model we will use for the Dashboard. In a nutshell:
1. HTML Retrieval
2. HTML parsing and conversion to DataFrames
3. Power Query conversion of DataFrames to data tables

I will not dive into more details on the python script, you can have a look at the source code residing in this repository.

Dataflow
Project dataflow structure

Phase 4 : Power Bi Configuration & Presenting Data

Power Bi offers countless ways of importing data to your model, one of which is through a Python Script.

Screenshot from Power Bi Get data menu
Screenshot from Power Bi Get data menu

Power Query will run the script, retrieve the data and store them in separate tables. Our next consideration would be to confirm that the data types Power Query has presumed are the appropriate ones, and make changes if applicable.

Once our data model looks like in the screenshot below we are ready to jump into the visualization process!

The visualization is rather simple and I used only 3 types of visuals to achieve : Cards, a custom Scroller & Line Charts.

Cards are used to diplay all macro values (GDP growth, unemployment, inflation) and bonds price and variation.
A scroller is utilized to display the stock index whereas line charts are adding some interactivity to the dashboard, as they display a whole set of values which can be previewed once the user hovers the mouse over the graphic.
Finally, a slicer linked with the line charts only can be used to extend or shorten the time span of the fuel price observation.

You can see the final version of the dashboard in the link below:

Thanks for reading!

--

--