Power Query Quickbytes# 2: Summary Statistics using Data Preview
QuickBytes is a series of articles for simple and easy to implement techniques with exponential benefits
I love the Data Preview feature in Power Query. As an analyst, I would prefer to get an overall summary of data. Data Preview is a great way to achieve this.
Benefit# 1: Visual representation of the overall health of the column
The bluish-green bar below the header name visually represents the health of the column:
- A solid greenish-blue line represents non-blank values
- A greyish-black fill represents null
- A red fill represents the presence of errors (primarily due to invalid data conversions)
When hovering on the bar, it displays additional details:
Check the Column Quality option under View to display the details:
Benefit# 2: Column Cardinality
Cardinality is the representation of the presence of unique values in a column. The higher the number of unique values, the higher the cardinality.
For an efficient data model, Power BI experts recommend removing the unnecessary columns, especially with high cardinality. Tables with fewer cardinality columns help in better table compression, hence faster data loading and query execution.
Use Column Distribution feature to show the distribution and the cardinality details quickly:
Benefit#3: Summary Statistics using Column Profile
Getting a summary statistic of a column is one of the first steps performed in any data analytics. It helps in getting an overview of data distribution.
Enable Column Profile:
View > Column Profile
Enabling Column Profile in Power Query provides a visual representation of the distribution along with a summary statistic.
For a number column:
For a text Column:
Upon hovering, it provides additional details (count, % contribution):
We can copy-paste the summary statistics:
We can filter our data table using options available in Column Profile
Defining the Scope of Data Preview
By default, Power Query shows only the top 1000 records for preview. We can view the status at the bottom left corner of the Power Query window:
We can, however, increase the scope to the entire data set, by clicking on it and selecting changing the selection :
A word of caution: The processing time to show the Data Preview will change according to the number of records in the dataset and the processing power of the machine. So, use it wisely.
I write about MS Excel, Power Query, Power BI, Power Pivot, DAX, Data Analytics, and sometimes travelling.