Data Analysis in Power BI from Start to Finish: Dashboard

Shaheer Khan
Analytics Vidhya
Published in
6 min readAug 1, 2021

A step-by-step comprehensive breakdown of Data Analytics and Report Building in Power BI; solving a marketing problem for an online retailer.

Introduction

Back Story:

A targeted marketing campaign needs to be created for a national clothing chain. The need for this has arisen due to flat sales and lost customers. The company wants to advertise specific products to specific customers in specific locations. The company is looking for data driven decision making to target the appropriate customer base. Some of the products they have are:

  • Shirt — $25
  • Sweater — $100
  • Leather Bag — $1000

Goal:
Conduct a data driven analysis to find the best product to advertise to each customer.

Data Sources:

The data sources used in the analysis are:

US Census Bureau Data
-
Average Income
- Location
- Population
- Industry

Business Data
- Product Inventory
- Product Prices
- Customer Rating

Customer Data
- Customer ID
- Names
- Location
- Date of Birth
- Purchase History

Additional Data
- Weather
- Demographics (Industry Salary Data)

Additional Data Gathering

To help better understand the customer demographics and market conditions and to create a more rigorous analysis, additional data was gathered. Two extra data sources were used, one being Weather Data, and the other being Industry Salary Data.

Weather Data Collection

Weather data was collected and cleaned using python and Visual Crossing Weather API. The code can be found on my GitHub.

Industry Salary Data Collection

The Industry Salary Data was collected from U.S. Bureau of Economic Analysis (BEA) and cleaned in Power Query.

Data Processing

Census Data was formatted, cleaned, and merged to create one table using power query.

Customer Data was formatted and cleaned in power query.

Business Data had two tables: purchase list and product inventory. Both were cleaned and formatted in power query. Specifically, the cleaning of purchase list table was the most complicated one hence the process is shown below. The purchase list is the sales history, and this was found in the untidiest format. The purchase list was formatted and cleaned using the unpivot feature as the data was presented in wide format as opposed to long format. Unpivoting was needed to analyze the data in Power Bi. Below is a before and after image of the purchase list table. Here is the detailed reason for converting the tabular structure of the table.

Before unpivoting, the data was found in wide format.
After unpivoting, the data was converted to long format.

Additional Data was formatted and cleaned in power query.

Next, a new table, called the “Regression Table”, was used to predict Customer Incomes using linear regression. Although we don’t know the incomes of our customers, we should be able to predict it by using their purchase history for the last 6 months found in Customer Data and the Average Income per State from the Census Data. This way the predicted income of the customers in the respective states can be found.

Categorizing Data

Consequently, categories need to be created in calculated columns using DAX (Data Analytics Expressions). Firstly, using the product inventory table we create two calculated columns:

  1. Price Categories — This column categorizes products in 3 price brackets:
    - $25 - $349 — Low
    - $350 - $674 — Medium
    - $675 - $1000 — High
  2. Weather Clothes — This column categorizes products in 3 brackets:
    - Winter
    - Summer/Spring
    - Neutral

Then the same process was repeated in the Regression Table where the predicted customer incomes were categorized:

  1. Predicted Income Ranges — This column categorizes incomes in 3 brackets:
    - [$78,883 - $91,599] — Low
    - [$91,600 - $104,314] — Medium
    - [$104,315 - $117,031] — High

Data Analysis/Visualizations

Once all the required data is presented in a clean format, then analysis with the help of visualizations can begin.

Next, essential data visualizations were created to make sense of the data at hand.

These were the key visualizations required:

  • Scatter Plot with r² value to show relationship between Customer Ratings and Product Return Rate
  • Scatter Plot with linear regression formula and r² value to show relationship between Average Household Income by State and Average 6 Months Sales by State
  • Heatmap was used to visualize Household Income distribution across the US
  • Predicted Income histogram shows the distribution and shape of Predicted Income by Category

These are the additional supporting visualizations:

  • Histogram of Predicted Income
  • Product Price Histogram shows the distribution and shape of products by Price Category
  • Box Plot to depict distribution of Temperature in different States
  • Scatter Plot with r² value to deduce relationship between Average Temperature and Total Sales
  • Filled Map showing Average Temperature across the US map
  • Weather Product Category Histogram shows the distribution and shape of products by Weather Category
  • Bar Chart depicting Average Salaries by Industry
  • Stacked Bar Chart to show industry composition by Salaries and by State
  • Tree Map to show the Top 8 Industries by Average Salaries

All these visualizations can be seen in the Power Bi Report.

Results

There is a strong positive correlation between average income and average sales.

Picture from the Power Bi Report — Scatter Plot of Average Income and Average Sales

There is a strong negative correlation between customer ratings and return rate.

Picture from the Power Bi Report — Scatter Plot of Customer Ratings and Return Rate

Customers in these states/locations have the highest incomes:

  • Washington
  • Maryland
  • Massachusetts
  • New Hampshire
  • Illinois
  • District of Columbia
Picture from the Power Bi Report — Income Insights by States

Next, Products in the “$25-$349 — Low” category should be advertised the most. The products in this category include:

Picture from the Power Bi Report — Product Insights by Price Categories

Marketing Strategy:

Since there are a greater number of customers with low to medium income, the products which are low in price should be advertised the most as it covers majority of the demographics.

Using additional data collected, a weather analysis shows that there are more states which have on average more cold weather than warm weather. Summer/Spring products have the highest ratings whereas winter products have the lowest ratings.

Average price is the lowest in the Summer/Spring products category. Therefore, Summer/Spring products should be advertised slightly more than Winter products.

Picture from the Power Bi Report — Weather Insights by Products and States

Next, by using the salary by industry data collected for the respective states. We see that by filtering the Top 5 Industries in terms of highly grossing average salaries, states such as California, New York, Texas, and Florida have the highest average salaries and highest average populations. So, marketing products that are needed in these areas and that are pricier is a reasonable point of focus.

Picture from the Power Bi Report — Salary Insights by Industry

Power Bi Report

Here is a view of the home page of the report:

Home Page of the Report

The full interactive Power Bi visualization can be accessed here.

Thank You for Reading!

I thoroughly enjoyed sharing my journey of documenting the process of creating a rigorous marketing analysis in Power Bi. I hope to have inspired others to document and share their analysis journeys so that analyst like myself can take inspiration and produce more insightful analysis in our careers. If you are looking to get started with data analytics, data storytelling and other data wizardry, make sure to follow me on Medium and let’s connect on LinkedIn. You can visit my website to see my portfolio.

If there is enough demand, then I can write a detailed breakdown of the data cleaning process that was carried out in this project using Power Bi.

References

--

--

Shaheer Khan
Analytics Vidhya

A data enthusiast, looking to extract meaningful information out of this data laden world of ours. Data Analyst | Data Storyteller | Dashboard Designer