Building a KPI dashboard in Streamlit using Python

A step-by-step guide to creating a Python data analyst portfolio project

Cameron Jones
4 min readMay 6, 2023

Follow me on Twitter and check out the source code on GitHub

Sales KPIs dashboard

Checkout the web app here: https://cameronjoejones-streamlit-sales-dashboard-app-3pmk71.streamlit.app/

In today’s data-driven world, the ability to quickly and effectively analyse data is crucial. Data analysts need powerful tools that can help them gain insights from data and share their findings with others. In this tutorial, I’ll walk you through the process of building an interactive sales dashboard using Streamlit, Pandas, and Plotly.

Streamlit is an open-source Python library that enables you to build interactive web applications quickly and easily. Pandas is a popular data manipulation library, and Plotly is a library for creating interactive visualisations. Together, these libraries make it simple to create an engaging and informative dashboard.

Data for the project

In this project, the data is from Kaggle — you can download the dataset from here: https://www.kaggle.com/datasets/kyanyoga/sample-sales-data or use your own dataset and rebuild the project yourself.

Setting up your environment

To get started, install the required packages:

pip install streamlit
pip install pandas
pip install plotly

I’ll be using the following code to create our sales dashboard. In the sections below, I will explain each part of the code and how it contributes to the final product.

Code

Import libraries and set up the environment:

from typing import List, Tuple

import pandas as pd
import plotly.express as px
import streamlit as st

This section imports the necessary libraries: pandas, plotly.express, and streamlit. Additionally, it imports List and Tuple from the typing module for type hinting.

Set up the page configuration:

def set_page_config():
st.set_page_config(
page_title="Sales Dashboard",
page_icon=":bar_chart:",
layout="wide",
initial_sidebar_state="expanded",
)
st.markdown("<style> footer {visibility: hidden;} </style>", unsafe_allow_html=True)

The set_page_config() function sets the page title, icon, layout, and initial sidebar state for the Streamlit app. It also hides the default footer using custom CSS.

Load and cache data

@st.cache_data
def load_data() -> pd.DataFrame:
data = pd.read_csv('data/sales_data_sample.csv', encoding='latin1')
data['ORDERDATE'] = pd.to_datetime(data['ORDERDATE'])
return data

The load_data() function reads a CSV file containing sales data and converts the ORDERDATE column to a datetime format. The @st.cache_data decorator caches the result of this function, ensuring that the data is only loaded once.

Data filtering and KPI calculation:

def filter_data(data: pd.DataFrame, column: str, values: List[str]) -> pd.DataFrame:
return data[data[column].isin(values)] if values else data


@st.cache_data
def calculate_kpis(data: pd.DataFrame) -> List[float]:
total_sales = data['SALES'].sum()
sales_in_m = f"{total_sales / 1000000:.2f}M"
total_orders = data['ORDERNUMBER'].nunique()
average_sales_per_order = f"{total_sales / total_orders / 1000:.2f}K"
unique_customers = data['CUSTOMERNAME'].nunique()
return [sales_in_m, total_orders, average_sales_per_order, unique_customers]

The filter_data() function filters a DataFrame based on the given column and list of values. The calculate_kpis() function calculates four key performance indicators (KPIs): total sales, total orders, average sales per order, and the number of unique customers.

Display KPI metrics:

def display_kpi_metrics(kpis: List[float], kpi_names: List[str]):
st.header("KPI Metrics")
for i, (col, (kpi_name, kpi_value)) in enumerate(zip(st.columns(4), zip(kpi_names, kpis))):
col.metric(label=kpi_name, value=kpi_value)

The display_kpi_metrics() function takes a list of KPIs and their corresponding names and displays them as metric cards in a row using Streamlit's st.metric() function.

Display the sidebar and filters:

def display_sidebar(data: pd.DataFrame) -> Tuple[List[str], List[str], List[str]]:
st.sidebar.header("Filters")
start_date = pd.Timestamp(st.sidebar.date_input("Start date", data['ORDERDATE'].min().date()))
end_date = pd.Timestamp(st.sidebar.date_input("End date", data['ORDERDATE'].max().date()))

product_lines = sorted(data['PRODUCTLINE'].unique())
selected_product_lines = st.sidebar.multiselect("Product lines", product_lines, product_lines)

selected_countries = st.sidebar.multiselect("Select Countries", data['COUNTRY'].unique())

selected_statuses = st.sidebar.multiselect("Select Order Statuses", data['STATUS'].unique())

return selected_product_lines, selected_countries, selected_statuses

The display_sidebar() function creates a sidebar with various filters (date range, product lines, countries, order statuses).

Display charts and data tables:

def display_charts(data: pd.DataFrame):
combine_product_lines = st.checkbox("Combine Product Lines", value=True)

if combine_product_lines:
fig = px.area(data, x='ORDERDATE', y='SALES',
title="Sales by Product Line Over Time", width=900, height=500)
else:
fig = px.area(data, x='ORDERDATE', y='SALES', color='PRODUCTLINE',
title="Sales by Product Line Over Time", width=900, height=500)

fig.update_layout(margin=dict(l=20, r=20, t=50, b=20))
fig.update_xaxes(rangemode='tozero', showgrid=False)
fig.update_yaxes(rangemode='tozero', showgrid=True)
st.plotly_chart(fig, use_container_width=True)

col1, col2, col3 = st.columns(3)

with col1:
st.subheader("Top 10 Customers")
top_customers = data.groupby('CUSTOMERNAME')['SALES'].sum().reset_index().sort_values('SALES',
ascending=False).head(10)
st.write(top_customers)

with col2:
st.subheader("Top 10 Products by Sales")
top_products = data.groupby(['PRODUCTCODE', 'PRODUCTLINE'])['SALES'].sum().reset_index().sort_values('SALES',
ascending=False).head(
10)
st.write(top_products)

with col3:
st.subheader("Total Sales by Product Line")
total_sales_by_product_line = data.groupby('PRODUCTLINE')['SALES'].sum().reset_index()
st.write(total_sales_by_product_line)

The display_charts() function creates and displays interactive charts and data tables using Plotly and Streamlit. It includes an area chart for sales by product line over time, a top 10 customers table, a top 10 products by sales table, and a total sales by product line table.

The main function:

def main():
set_page_config()

data = load_data()

st.title("📊 Sales Dashboard")

selected_product_lines, selected_countries, selected_statuses = display_sidebar(data)

filtered_data = data.copy()
filtered_data = filter_data(filtered_data, 'PRODUCTLINE', selected_product_lines)
filtered_data = filter_data(filtered_data, 'COUNTRY', selected_countries)
filtered_data = filter_data(filtered_data, 'STATUS', selected_statuses)

kpis = calculate_kpis(filtered_data)
kpi_names = ["Total Sales", "Total Orders", "Average Sales per Order", "Unique Customers"]
display_kpi_metrics(kpis, kpi_names)

display_charts(filtered_data)


if __name__ == '__main__':
main()

The main() function ties everything together. It sets up the page configuration, loads the data, and displays the title, sidebar, KPI metrics, and charts. The data is filtered based on the selected filters in the sidebar, and the KPIs are recalculated and displayed accordingly. The dashboard is executed by calling the main() function when the script is run.

Testing the app/ running locally

To run the app, save the code in a file named app.py, and then run the following command:

streamlit run app.py

Your app will open in a web browser and allow you to see the dashboard in a local environement.

Deploying the App to the web

Simply follow Streamlit’s simple documentation on how to deploy your app to the web https://docs.streamlit.io/streamlit-community-cloud/get-started/deploy-an-app

Happy coding!

--

--