Data Modeling in Power BI: What Is It, and Why Is It Useful?
For almost every transaction, there are usually four questions that can be asked about its occurrence: who, what, when, and where? For most roles in business analytics, being able to easily answer these key questions leads to better decision-making, planning, and root-cause analysis. In my role as a Demand Planner, answering these questions is fundamental to understanding a forecast. Questions like: “who is ordering this product?”, “when is inventory expected to reach the Distribution Center?”, and “which accounts are in the forecast?” are all important questions to answer quickly and with conviction. In the following scenario, I will show how importing and modeling Excel transactional data in Power BI using a STAR data schema can assist in your data analysis.
Fact Tables: The Numerical Data
These tables contain transactional data in a business setting, and contain both the numeric data and foreign keys that are contained as unique keys in the dimension tables. For instance, a fact table may contain sales amounts, with a column specifying product key, transaction date, customer ID, and customer zip code. In a fact table, there will likely be duplicate values — in fact, this would indicate a healthy business, as one would expect repeat customers, product orders, etc. In reality, these fact tables are usually viewed when aggregated by their dimensions (i.e. total sales by time, sales by customer, sales by region, etc.) and less often examined at the line-level of detail. An example fact table is shown below (example data is from a Sample Superstore dataset provided by Tableau. Link to source data is here ).
Dimension Tables: The “Who? What? When? and Where?” Tables to Every Transaction
Dimension tables are descriptive tables that give more detail about your data. For instance, let’s consider our sales dataset, that contains four columns: a product key, a sales date, a customer ID, and a zip code. Dimension tables would help answer more questions about the information contained in the fact table (i.e. product category/sub-category of the product ID, customer name/segment of the customer ID, state of the zip code, etc.). Unlike fact tables, these dimension tables contain an unduplicated primary key that is joined to the fact table. Below I have posted an example dimension table (Product Table) which contains descriptive information about each product ID.
Can’t I just merge data together or perform a VLOOKUP?
While it is easy to want to pull in attributes to a transactional table via a merge, it is less repeatable and more expensive in terms of space and time. Merging data not only adds columns, but also requires formulas/joins each time more dimensions are pulled in. By creating dimension tables and linking them to fact tables, it is easier to create dynamic charts from transactional data without having to merge in new information each time. Less time spent digging through different data sources means more time finding valuable insights.
A Simple Model: The STAR Schema in Power BI
After importing the transactional data and dimension tables (product, customer, and region), I created an unduplicated date table within Power BI. Power BI has excellent functionality in creating date tables; I used the DAX query CALENDAR() to set the parameters for this table. Once these were imported, I went into the “Modeling” tab of Power BI, and created relationships between the dimension tables and fact tables. This created a “star” schema, with the central data being the fact table, surrounded by the dimension tables. Any measures in the fact table (Transaction table) can then be broken out by these dimensions.
The End Product: Repeatable and Effective All-in-One Dashboards
As seen in the dashboard above, this STAR data model helped create an end product with far easier-to-understand presentation of the data than the original fact table. When dimension tables are linked to their relevant fact tables, it becomes far easier to answer the “who”, ”what”, “when”, and “where” questions that frequently occur in a business analytics setting. For instance, by joining these dimension tables to their fact tables, this Power BI model segments sales by customer, product category, date, and region — all in one dashboard!
PS: If you enjoyed this article, or have any feedback, I would love to hear it. Feel free to add me on Linkedin or shoot me a message: