Automating Excel Dashboard Creation with Python

Uma Rajagopalan
5 min readJan 13, 2023

--

To all those learning about data science and analytics out there, if you haven’t learned automation yet, this article is going to make your life easy!

A Little Backstory:

I’ve been attending interviews for 6 months now for data analyst and data science positions. All my interviews usually have the same pattern or rather I started observing them like how any data enthusiast does. There’s one “tell me about yourself” that’s common and based on my answer, there are further questions on it or my resume.

I interned at a company as part of my capstone project where I helped the company with its data analytics and data science requirements. A major aspect of my role was to generate Key Performance Indicators (KPIs) to measure the performance of the company on a monthly basis. To anyone who does not know what KPI is, this is what Wikipedia says:

Now, this statement is enough for the person interviewing me to get intrigued and tell me how that aligns with what is to be done for the position I’m interviewed for and ask me further about it. And then, the question I dread would be asked.

“Did you implement automation for generating your monthly reports?”

At this moment, I just want to be honest with them and tell them that I have not and how I really regret not doing it! A person who knows Python, SQL, and Excel does not automate to make their life easy? Really? Plus, I have a software engineering background!

So, I decided to change that today. I automated generating my excel reports. I finally did it!🥹

The Real Stuff and just a little Fluff

Through this article, you’ll see how we can automate generating an excel dashboard and creating a simple data pipeline with just a click of a button.

Well, it’s not just one button — it’s a few lines of code!

Like anyone who starts their data analytics dashboard creation journey, I started with my man Alex the Analyst’s easy excel dashboard on youtube.

This 40 min video has helped me so much in understanding the beauty of data. I’m going to use the dataset used in this video and automate the creation of the dashboard made.

I’ll be using the python library — openpyxl

  1. Let’s read the excel file first.
#Reading the Excel file and the sheet name
file_name = 'Bike_Sales_Playground.xlsx'
bike_df = pd.read_excel(file_name,sheet_name='bike_buyers')

2. Alex then creates a working sheet to not mess with the raw data.

#We don't want to mess with our raw data, thus, making a copy of it into a sheet called Working_Sheet.
with pd.ExcelWriter(file_name,#Name of the Workbook
engine='openpyxl',#Name of the engine
mode='a',#Append mode
if_sheet_exists="replace" #Replacing the sheet if it already exists
) as writer:
bike_df.to_excel(writer, sheet_name='Working_Sheet',index = False)#Setting index to False to avoid the unnecessary column Unnamed:0

#Let's read the working sheet data into our dataframe
bike_df = pd.read_excel(file_name,sheet_name='Working_Sheet')

3. Data Cleaning

#Dropping duplicates from the data
bike_df.drop_duplicates(keep='first', inplace=True, ignore_index=False)

#Replacing M to Married and S to Single in Marital Status column
bike_df['Marital Status'] = bike_df['Marital Status'].replace('M','Married').replace('S','Single')

#Replacing F to Female and M to Male in Gender column
bike_df['Gender'] = bike_df['Gender'].replace('F','Female').replace('M','Male')

#Viewing the changed column values
bike_df.head()

#Age is better in brackets
bike_df['Age brackets'] = bike_df['Age'].apply(lambda x: 'Less than 30' if x<=30 else('Greater than 55' if x>55 else '31 to 55'))

#Replacing Commute Distance value 10+ Miles to More than 10 Miles
bike_df['Commute Distance'] = bike_df['Commute Distance'].replace('10+ Miles','More than 10 Miles')

4. Exploratory Data Analysis

Alex created 3 pivot tables and charts in his video. Below is the code for the bar chart. The 2 other line charts have similar codes which can be found on my GitHub repository.

#Pivot table 1
#Average Income per Gender based on Purchased Yes or No
avg_gender_income_df = np.round(pd.pivot_table(bike_df,
values = 'Income',
index = ['Gender'],
columns = ['Purchased Bike'],
aggfunc = np.mean
),2)

#Now that we have made all changes in the dataframe, let's load it into the excel file
with pd.ExcelWriter(file_name,#Name of the Workbook
engine='openpyxl',#Name of the engine
mode='a',#Append mode
if_sheet_exists="replace" #Replacing the sheet if it already exists
) as writer:

avg_gender_income_df.to_excel(writer, sheet_name='Average_Gender_Income')

# loading workbook and selecting sheet
wb = load_workbook(file_name)
sheet = wb['Average_Gender_Income']

# Bar chart creation
chart1 = BarChart()
chart1.type = "col"
chart1.style = 10
chart1.title = "Average Income by Gender and Purchase Data"
chart1.y_axis.title = 'Gender'
chart1.x_axis.title = 'Income'

#Attach the chart to the worksheet
data1 = Reference(sheet, min_col=2, min_row=1, max_row=3, max_col=3)#Including Headers
cats1 = Reference(sheet, min_col=1, min_row=2, max_row=3)#Not including headers
chart1.add_data(data1, titles_from_data=True)
chart1.dataLabels = DataLabelList()
chart1.dataLabels.showVal = True
chart1.set_categories(cats1)
chart1.shape = 4
sheet.add_chart(chart1, "A10")
wb.save(file_name)

5. Creating the Dashboard

#Creating an empty Dataframe
title_df = pd.DataFrame()

#Now that we have made all changes in the dataframe, let's load it into the excel file
with pd.ExcelWriter(file_name,#Name of the Workbook
engine='openpyxl',#Name of the engine
mode='a',#Append mode
if_sheet_exists="replace" #Replacing the sheet if it already exists
) as writer:

title_df.to_excel(writer, sheet_name='Dashboard')

# loading workbook and selecting sheet
wb = load_workbook(file_name)
sheet = wb['Dashboard']
for x in range(1,22):
sheet.merge_cells('A1:R4')

cell = sheet.cell(row=1, column=1)
cell.value = 'Bike Sales Dashboard'
cell.alignment = Alignment(horizontal='center', vertical='center')
cell.font = Font(b=True, color="F8F8F8",size = 46)
cell.fill = PatternFill("solid", fgColor="2591DB")

#Adding all our pivot charts to the dashboard
sheet.add_chart(chart1,'A5')
sheet.add_chart(chart2,'J5')
chart3.width = 31
sheet.add_chart(chart3,'A20')
wb.save(file_name)

6. Let’s look at our dashboard

Bike Sales Dashboard

7. It’s time to put all this into a function and execute it

import Bikes_Sales_Report_Automation as auto

#Enter your filename here
auto.automate_excel_dashboard('Bike_Sales_Playground.xlsx')

8. Click on the button to execute line 7 and voila! Your dashboard is created

Summary:

We created a simple function that when executed generates an excel dashboard automatically. You can run this function anytime you want to generate a report on a periodic basis like every month for generating KPIs in a dashboard.

Limitations:

I wish I could figure out how to add the cool excel slicer that Alex had in his video but openpyxl has a limitation to adding a slicer. If you know how to add excel slicers using python, do let me know!

Check out the code and final output here.

--

--