More flexible object sorting in Power BI
In this article, I will discuss a workaround that provides greater flexibility in sorting objects within Power BI. While Power BI offers built-in sorting options, they can be click-consuming and may not always support specific actions, such as adjusting the sorting of attributes used in matrix columns.
Here is my proposal on how to address the requirement for more flexible sorting in a user-friendly manner.
Prerequisites:
- Duplicate Columns: We need to create duplicate columns for each attribute that should offer more than one sort order for its elements. These duplicate columns will allow us to apply secondary / tertiary / quaternary / ∞ sorting.
- Sorting Columns: We need to create additional columns that will serve as sorting columns. These columns will determine the sorting order for the attribute elements.
- Field Parameter and Linked Tables: We also need field parameter tables and linked tables that let users select attributes for visualizations and pick their preferred sorting option.
To build the duplicate and sorting columns, you can use database views; in my case, metastore views in Databricks. The example I will be presenting covers two scenarios to (hopefully 😉) explain the purpose behind constructing the aforementioned views:
- Enable sorting of time attributes (years and year months) in both ascending (oldest to newest) and descending (newest to oldest) order.
- Enable sorting of product and store attributes in two ways: alphabetically and by revenue in the last closed year. However, please note that this approach won’t be dynamic.
To create the calendar dimension view, you can include duplicate columns for the year and month_num attributes. Additionally, you can add the exact same columns multiplied by -1. This approach allows you to utilize the “Sort by column” feature in Power BI for sorting purposes. By incorporating these duplicate columns with negative values, you can effectively sort the calendar dimension in the desired order within Power BI.
When creating the product and store dimension views, the approach becomes slightly more complex. In addition to adding duplicate columns for attributes such as brand, category, country, and customer, you also need to include the revenue from the last closed year at each of those levels, multiplied by -1. The negation turns Power BI's ascending sort into a descending-by-revenue sort for all the duplicate attributes.
To handle elements with no revenue in the last closed year, wrap the negated revenue in COALESCE with a fallback of 999: since every negated revenue of a selling element is below zero, elements without revenue always sort last.
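To see why negating revenue and backfilling with 999 produces the intended order, here is a minimal pure-Python sketch; the brand names and revenue figures are made up for illustration, and it assumes revenues are positive so every negated value stays below 999:

```python
# Revenue per brand in the last closed year; None means no sales.
brand_revenue = {
    "Alpha": 500,   # highest revenue -> should come first
    "Beta": 120,
    "Gamma": None,  # no revenue -> should come last
    "Delta": 300,
}

def rev_sort_key(revenue):
    # Mirrors COALESCE(revenue * -1, 999): negation turns an ascending
    # sort into descending-by-revenue; 999 pushes missing brands to the end.
    return revenue * -1 if revenue is not None else 999

ordered = sorted(brand_revenue, key=lambda b: rev_sort_key(brand_revenue[b]))
print(ordered)  # ['Alpha', 'Delta', 'Beta', 'Gamma']
```

Sorting the key column ascending, the highest-revenue brand comes first and the brand with no revenue lands at the end, which is exactly what the `_rev_sort` columns achieve in the views below.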
schema = "your_beloved_schema"
#-------------------
# calendar dimension
calendar_query = f"""
create or replace view {schema}.calendar_dim_vw
as
select
    date_num
    , month_num
    , year_month_name
    , year_month_name as year_month_name_dupl  -- duplicate for a second sort order
    , month_num * -1 as month_num_neg          -- sort key for newest-to-oldest
    , `year`
    , `year` as year_dupl
    , `year` * -1 as year_neg
from
    {schema}.calendar_dim
"""
spark.sql(calendar_query)
#------------------
# product dimension
product_query = f"""
create or replace view {schema}.product_dim_vw
as
-- revenue rows from the last closed (i.e. previous) year;
-- date_num is assumed to start with the four-digit year
with last_closed_year_revenue as
(
    select
        product_id
        , revenue
    from
        {schema}.sales_fact
    where
        cast(left(date_num, 4) as int) =
        (
            select
                max(cast(left(date_num, 4) as int)) - 1
            from
                {schema}.sales_fact
        )
)
select distinct
    p.brand
    , p.category
    , p.product_id
    , p.brand as brand_dupl
    -- negate so ascending sort = highest revenue first;
    -- 999 pushes brands with no revenue to the end
    , coalesce(sum(f.revenue) over (partition by p.brand) * -1, 999) as brand_rev_sort
    , p.category as category_dupl
    , coalesce(sum(f.revenue) over (partition by p.category) * -1, 999) as category_rev_sort
from
    {schema}.product_dim p
left join
    last_closed_year_revenue f
on
    p.product_id = f.product_id
"""
spark.sql(product_query)
#----------------
# store dimension
store_query = f"""
create or replace view {schema}.store_dim_vw
as
-- revenue rows from the last closed (i.e. previous) year
with last_closed_year_revenue as
(
    select
        store_id
        , revenue
    from
        {schema}.sales_fact
    where
        cast(left(date_num, 4) as int) =
        (
            select
                max(cast(left(date_num, 4) as int)) - 1
            from
                {schema}.sales_fact
        )
)
select distinct
    s.country
    , s.customer
    , s.store_id
    , s.country as country_dupl
    -- negate so ascending sort = highest revenue first;
    -- 999 pushes countries with no revenue to the end
    , coalesce(sum(f.revenue) over (partition by s.country) * -1, 999) as country_rev_sort
    , s.customer as customer_dupl
    , coalesce(sum(f.revenue) over (partition by s.customer) * -1, 999) as customer_rev_sort
from
    {schema}.store_dim s
left join
    last_closed_year_revenue f
on
    s.store_id = f.store_id
"""
spark.sql(store_query)
After loading the data into the Power BI semantic model and establishing relationships with the fact table, the next steps are as follows:
◾ Applying Proper Sorting of Columns: Once the data is loaded, assign each duplicate attribute column its corresponding sorting column using the “Sort by Column” feature in Power BI, e.g. sort year_dupl by year_neg and brand_dupl by brand_rev_sort.
◾ Fields Parameter Tables: The field parameter tables come into play to facilitate the selection of attributes and to define the sorting options. In this context, the pColumns table refers to the calendar dimension view, while the pRows table refers to both the product and store dimension views. The pColumns and pRows columns are used in the columns and rows of the matrix visual in the reporting layer.
◾ Linked Tables: Linked tables are simply the distinct values of the sorting and attribute columns of the field parameter tables. They are used in the slicers in the reporting layer.
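One thing worth double-checking before the “Sort by Column” step: Power BI rejects the assignment when an attribute value maps to more than one sort value, which is exactly the one-to-one rule the views above are built to satisfy. A minimal pure-Python sketch of that check; the sample rows and column names are illustrative, not taken from a real dataset:

```python
def violates_sort_by_column(rows, attr, sort_col):
    """Return attribute values that map to more than one sort value.

    Power BI's "Sort by Column" raises an error when an attribute value
    (e.g. a year_month_name) corresponds to several sort values, so the
    view design must guarantee a one-to-one mapping.
    """
    seen = {}
    for row in rows:
        seen.setdefault(row[attr], set()).add(row[sort_col])
    return [value for value, sorts in seen.items() if len(sorts) > 1]

# Illustrative calendar rows: month_num_neg is unique per year_month_name.
calendar_rows = [
    {"year_month_name": "2023-Jan", "month_num_neg": -1},
    {"year_month_name": "2023-Feb", "month_num_neg": -2},
    {"year_month_name": "2023-Jan", "month_num_neg": -1},  # duplicate row is fine
]
print(violates_sort_by_column(calendar_rows, "year_month_name", "month_num_neg"))  # []
```

An empty result means the sort assignment will succeed; any value in the list would make Power BI refuse the “Sort by Column” action for that pair.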
Here is the final structure of the semantic model used in this example:
You can now see how the solution behaves in the final stage, which is the Power BI report. Take a moment to enjoy this delightful gif that showcases the functionality in action 😅