Extreme Makeover Power BI Edition

Lorenzo Vercellati
Data Pied Piper
Published in
6 min readMay 19, 2021

Have you ever seen Extreme Makeover — Home Edition? The TV show where experts renovate a house transforming and enhancing it?
Well, today we are going to try to do the same thing with Power BI.
A large portion of Power BI users are business users, with a strong excel background. This means that for many of them a data model has always been nothing more than a huge table of data with many columns to analyze with a pivot table. In the end it is a matter of habit. A habit that, once you start using Power BI, should absolutely be eradicated.
A data model is a set of tables, correlated by relationships and whoever develops Power BI reports must develop a mindset oriented in this sense.
Now I’ll show you how to transform with Power Query the old huge excel table into a real data model, structured according to the best practices of the star schema.
For this purpose I simulated a large single table using a view built on the AdventureWorksDW2019 database data.
This is not a very big table, but it will be more than enough for our purpose.
The table has more than 60 thousand rows and over 20 columns: there are the values to be measured but also product codes and description with its attributes, retailer data, geographic area and so on.

A matter of habit

Power Bi is a great tool, very flexible, and you can definitely import this data source as it is and build your own report. But not because of that, you actually have to do it.
Why?
I’m not going to thoroughly explain why, but I refer you to this very clear explanation by Marco Russo and Alberto Ferrari about the different reasons why a star schema is always better than a single table. I want just remember you, so far, a quote from the two MVPs above: “The speed, reliability, and power of a solution all stem from the same origin: a good data model.
Today, we will instead focus on how to get the former from the latter.

Let’s start with the original query, the one used to import the single table, as is. Since we need it as initial connection, but we don’t want to import it in the model, first we disable its loading, right-clicking on the query and unchecking the Enable load option.

When load disabled, the Single Table query appears in Italic font.

Now let’s start creating the tables that we’ll bring into our data model.
Let’s start with the product. First right click on the single table query and select Reference.

A new table is created which we rename Products. This new query uses the final result of the original one as data source. In the newly created query, select the columns ProductKey, Product, ProductSubCategory and ProductCategory, right-click on one of them and select Remove other columns.

Now we have a table with only the columns we are interested in, but with multiple rows for the same tuple. We, on the other hand, want a lookup table, with only one row for each product.
So let’s select the four columns again, right-click and choose the Remove Duplicates option.

The Products lookup table is right!

Now we can repeat the same steps for the others dimensions.

Now that the dimension tables are ready, we need to build the fact table.
The fact table does not need descriptive fields. So let’s select only the key fields that we will use in the model for the relationships with the various dimensions and the columns with the values and then click once more on Remove Other Columns.

Now Close and Apply and look at the data model we created.

We have a perfect star schema, with the fact table in the middle and all the dimension tables related to it by one-to-many relationships.

We are almost at the end, because, before starting dragging and dropping the visuals in the canvas, we need one dimension more: the Calendar table!

Someone might say, “Hey, but there’s an auto date/time option. It’s so convenient.
Sure it’s convenient, but that doesn’t mean it’s the best choice. I’ll explore this in a future post dedicated to the Calendar table, but right now it’s enough to think that Auto date/time gives us a very poor table, with only year, quarter, month and day.
Instead we will quickly build a richer and expandable table with all the columns we need.

Let’s go back to Power Query again and build a new query referenced by Single Table. Let’s remove all columns except the OrderDate.
Our goal is to create a calendar table that contains all the dates of the years to which our data refers.
We then need to find the minimum and maximum years in our data and use them to build our list of dates which will be the pillar on which we will build our calendar table.

We first derive the year from the date.
To do this just click on the Date button in the Transform section and select Year.

Well, now it’s time to harness the power of the M language!

We could still use Power Query to extract the minimum or maximum year from the column, but then we would still have to use M to calculate the other limit. So better to get started now!

Select the Calendar query and open the Advanced Editor

We use the List.Min and List.Max functions to calculate the minimum and maximum year, starting in both cases from the ExtractYear step.

Now we can get the start and end dates of our list, the number of occurrences it will have, generate it and transform it into a table.

Now that we have our calendar table with the key column, all we need to do is add all the columns we need like year, month, quarter, day of the week and so on.

Close and Apply and we will have our complete data model, well formed, scalable and easy to manage.

Now we can start creating tons of powerful measures and designing an amazing and performing report.

--

--

Lorenzo Vercellati
Data Pied Piper

PowerBI Solutions Architect, SQLSat DIAD PPWT Speaker, Medieval History Graduated, Football Addicted, HomeBrewer, Springsteen & Pearl Jam Fan