Step-by-step guide to your first interactive dash app — Part I

Paolo Molignini, PhD
8 min read · Oct 24, 2022


In his best-seller Homo Deus, Yuval Noah Harari claims that we are entering the age of “dataism”, in which data and information are viewed as a commodity of supreme value. However, data itself is of little value unless it can be visualized efficiently. If a picture is worth a thousand words, a good graph is worth a thousand numbers.

There are many data visualization packages out there, depending on whether you are working with Python, C++, Java, Julia, etc. I naturally gravitated towards Plotly/Dash because of my long experience with Python and because they have become very popular tools with a large support community behind them. Plotly is a graphing library and Dash is a framework for building web-based apps around it; both are available for Python (and more recently for R and Julia too) and provide a collection of chart types and HTML wrappers for interactive data visualization. With Plotly/Dash you can create beautiful figures that can be dynamically changed via buttons, dropdown menus, and more.

The Plotly/Dash community is fairly new (Plotly itself was founded in 2012) and rapidly growing. There is a wide range of Plotly-related information on the internet, such as basic tutorials, documentation on how to use single functions, a bustling forum, and many task-oriented tutorials. For big Dash applications one could also resort to Dash Enterprise, Plotly’s paid product, which offers more support for building, testing, deploying, managing and scaling. In my experience, however, if you are working on a smaller project and/or don’t have the financial means to pay for Dash Enterprise, linking all of the available (re)sources together into a start-to-finish project can be a daunting task that requires a lot of googling, testing, asking the Plotly community questions, and so on.

The aim of this guide is to remove all of these exploratory steps and provide an easy-to-follow example containing all the key concepts necessary to build and deploy an interactive Dash app. As an example, I will use my Dash app for visualizing the global spread of the 2022 monkeypox epidemic. Step by step, I will show you how to structure a Dash app, how to make it interactive via callbacks, how to perform very simple data analysis, and how to deploy the app online for free. The only prerequisites are a working setup and a fairly basic knowledge of Python: you need to have Python, NumPy and Pandas installed on your system (my code is optimized for Python 3.7.10), and you need to know how to import packages, define and modify variables, write and call functions, and run Python scripts. My app also relies on NumPy (Python’s package for numerical analysis) and Pandas (Python’s package for handling tabular data), but don’t worry! I will explain each command of the code, so you don’t have to be an expert in either of those packages. The repository for this tutorial is hosted on my GitLab page.

Are you ready? Let’s dive right in!

The data

There is no data visualization without data, so the first thing to do is to gather some interesting data to visualize! My app was designed to visualize data about the monkeypox epidemic, so that’s what we’re going to use.

When the first cases of monkeypox were detected in Europe in May 2022, I was immediately alarmed. Of course, Covid-19 was still very fresh in my (and everyone’s) memory. I feared the same lack of information and inaction by the authorities as when Covid-19 first appeared, especially because the monkeypox infections seemed to spread predominantly among men who have sex with men and we don’t have a very good track record of caring about LGBTQ+ health (e.g. HIV in the 80s). I could not find a single source that tracked the cases globally. All I found were newspaper articles covering the story and counting cases in each country separately. I was used to Google tracking Covid-19 infections and vaccinations daily, but there was no similar data visualization tool for monkeypox, probably because it was all so new. So I decided to track the spread of the disease myself by gathering data, visualizing it with Dash and deploying it online. Luckily, other websites now offer comprehensive visualizations of the disease spread, such as ourworldindata.org (which became the source for the later data we are going to use).

With the data I gathered, I wanted to show two main things:

  1. I wanted to have a world map that showed which countries were affected by the spread and which not (yet). I also wanted to show how many cases there were in each country on a given date.
  2. I wanted to show a graph of the cumulative cases as a function of time, both for each separate country and in total. The main reason for doing this was to see whether the infections were increasing exponentially or not. Unfortunately, it turned out that some countries saw an exponential increase at a certain point, although as of now the spread of the virus seems to be decreasing, probably because of both the use of the vaccine and an increased awareness of the ways of transmission.

On Twitter, a friend pointed out that it would have also been useful to have a fit of the cumulative cases to better visualize if the trend was indeed exponential or not, and to predict the evolution of the spread in the future. This is something that I also added in the end, together with several buttons to configure the parameters of the fit, the data selection etc., and the sources to my data at the bottom of the page. To get an idea of what we will deploy, you can check my app directly: https://monkeypox-dashboard.herokuapp.com.
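One common way to test whether a curve is exponential is to fit a straight line to the logarithm of the counts. The snippet below is only a sketch of that idea on synthetic numbers; it is not the fitting code used in the app, and the variable names are made up for illustration.

```python
import numpy as np

# Synthetic cumulative cases growing exactly exponentially (not real data).
days = np.arange(6)
cases = 5.0 * np.exp(0.3 * days)

# If cases = A * exp(r * t), then log(cases) = log(A) + r * t is a straight line,
# so a first-degree polynomial fit of log(cases) returns r (slope) and log(A).
slope, intercept = np.polyfit(days, np.log(cases), 1)
growth_rate = slope
doubling_time = np.log(2) / growth_rate

print(f"growth rate per day: {growth_rate:.3f}")
print(f"doubling time in days: {doubling_time:.2f}")
```

With real data the points will scatter around the line, and a poor fit in log space is a hint that the growth is not (or no longer) exponential.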

Disclaimer: note that the number of cases has not been updated in a while. This is for several reasons. After a couple of months the daily number of recorded cases peaked and started to decrease for almost all the countries. Also, at the same time other websites with more accurate monitoring appeared. In the end I therefore believed that my constant monitoring was not necessary anymore. The information about the monkeypox spread was good and available elsewhere. Nevertheless, I think that my app could be recycled as a learning tool for people getting started with plotly/Dash, so here we are!

But let’s go back to the data. All the visualization types I had in mind required infection cases as a function of country and time (date). It made sense to me then to have the data saved as a series of .csv files (tables) labeled by the date, containing a column of all the countries on the map and at least another column containing the monkeypox cases for each country. Since I was mainly tracking the cumulative cases, I opted for saving only the number of cumulative cases in the .csv files. I used as a template the .csv file already provided with plotly’s Choropleth graph object. This contains a list of countries, their GDP, and their international code. I simply filled it with an additional column containing the number of cases, see the screenshot below.
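To make the layout concrete, here is a minimal sketch of what one of these files might look like once loaded with Pandas. Only the “CODE” column name is taken from the text; the other column names (including `CASES`) are assumptions, and all the numbers are placeholders rather than real data.

```python
import io

import pandas as pd

# Placeholder content imitating a single cases-<date>.csv file.
# Column names other than CODE are assumptions; every number is made up.
csv_text = """COUNTRY,GDP (BILLIONS),CODE,CASES
Italy,100.0,ITA,10
Sweden,200.0,SWE,25
Switzerland,300.0,CHE,0
"""

# read_csv accepts any file-like object, so no file on disk is needed here.
df = pd.read_csv(io.StringIO(csv_text))
print(df[["COUNTRY", "CODE", "CASES"]])
```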

Note that the Choropleth graphical object requires a location anchor, which is provided by the “CODE” column. The GDP column is instead completely unnecessary. Originally, I had planned to correlate the number of cases with the GDP, but in the end I never got around to it.

It is of course possible to save the same data in a single file instead of many, but ultimately it was easier for me to update the data by creating new files instead of manipulating old ones (and risking inadvertently deleting entries or doing other damage). Another perk is that I can use these .csv files to plot the data on a world map more or less directly, as we’ll see below.

Note that the graph for the cumulative cases as a function of time cannot be plotted directly from these .csv files. We will need to create a new dataframe by going through each file and extracting the data we need. This however is not a hard task.
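The idea can be sketched as follows, assuming each per-date table has already been loaded into a dataframe. The dates, the `CASES` column name and the numbers are hypothetical, and the actual code in the repository may be organized differently.

```python
import pandas as pd

# Hypothetical per-date tables, as they would look after pd.read_csv().
daily = {
    "2022-06-01": pd.DataFrame({"CODE": ["ITA", "SWE"], "CASES": [10, 2]}),
    "2022-06-02": pd.DataFrame({"CODE": ["ITA", "SWE"], "CASES": [14, 5]}),
}

# One column per date, one row per country code.
wide = pd.DataFrame(
    {date: table.set_index("CODE")["CASES"] for date, table in daily.items()}
)

# Transpose so that dates become the index, then add a worldwide total.
timeseries = wide.T
timeseries.index = pd.to_datetime(timeseries.index)
timeseries = timeseries.sort_index()
timeseries["TOTAL"] = timeseries.sum(axis=1)

print(timeseries)
```

Each column of `timeseries` is then a curve of cumulative cases over time that can be passed directly to a line plot.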

The file structure and the virtual environment

Now that we have the data in place, we can think about how to access it and how to structure the code for our app. Browsing around the internet, I found that the standard convention seems to be putting all the data into a folder named “assets”. From the screenshot below you can see that, apart from the cases-*date*.csv files, there is another file called total-cases.csv that we can generate on the fly (more on this later). In the root folder for our app, which I called “monkeypox-deploy”, we then have the app itself (app.py), the assets, a folder called “modules” that contains all the other Python commands and functions we need to make the app work, and several other files associated with the deployment on Heroku, including .git and .gitignore. For now, we will focus only on the app and its assets.

Structure of the repository with separate folders for assets and modules
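With this layout, collecting the dated files from the assets folder takes only a few lines. The helper below is a sketch: the function name is made up, and the YYYY-MM-DD date format in the file names is an assumption, so adapt it to whatever naming you use.

```python
import tempfile
from datetime import datetime
from pathlib import Path


def list_case_files(assets_dir):
    """Return (date, path) pairs for every cases-<date>.csv file, oldest first."""
    found = []
    for path in Path(assets_dir).glob("cases-*.csv"):
        date_part = path.stem[len("cases-"):]
        # Assumed file name format: cases-YYYY-MM-DD.csv
        found.append((datetime.strptime(date_part, "%Y-%m-%d").date(), path))
    return sorted(found)


# Demo in a throwaway directory that mimics the assets folder.
with tempfile.TemporaryDirectory() as tmp:
    assets = Path(tmp) / "assets"
    assets.mkdir()
    for name in ("cases-2022-06-02.csv", "cases-2022-06-01.csv", "total-cases.csv"):
        (assets / name).touch()
    dates = [day.isoformat() for day, _ in list_case_files(assets)]

print(dates)  # ['2022-06-01', '2022-06-02']
```

Note that total-cases.csv is skipped because it does not match the cases-*.csv pattern.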

However, there is another useful step to take before starting to look at app.py: creating a virtual environment. If you’re not familiar with python virtual environments, this is the corresponding python doc and this is a nice guide on realpython.com that explains what they are and how to use them.

In a nutshell, a virtual environment is a separate collection of python packages that can be summoned for a specific project. One of the main advantages of using virtual environments is that it keeps your dependencies nice and tidy. In other words, you can use different packages and even different versions of the same packages for different projects without polluting your system with incompatible versions. One project needs numpy 1.20 but another needs 1.23? No problem. You can create different virtual environments and activate them only when you need to work on one or the other project.

If you look again at the screenshot of the folder structure, you can see that in monkeypox-deploy there is another folder called monkeypox-deploy-venv. This is the folder containing the virtual environment. To create it, you can simply type in the terminal

python3 -m venv monkeypox-deploy-venv

This command runs Python’s venv module as a script, which creates the directory monkeypox-deploy-venv along with subfolders containing a copy of (or a symlink to) the Python interpreter and various supporting files. Once the virtual environment has been created, you can activate it from the shell (the command below is for macOS and Linux; on Windows, run monkeypox-deploy-venv\Scripts\activate instead)

source monkeypox-deploy-venv/bin/activate

All done! Our virtual environment is ready and we can start installing all the packages we need for our dash app.
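If you are unsure whether the activation worked, you can ask Python itself. The short check below relies on the fact that, inside an environment created with venv, `sys.prefix` points to the environment while `sys.base_prefix` points to the original installation; the function name is just for illustration.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a venv-created environment."""
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtualenv())
print("interpreter:", sys.executable)
```

When the environment is active, the interpreter path printed on the second line should sit inside the monkeypox-deploy-venv folder.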

Package installation

Before we start implementing the app, let’s make sure we have all the required packages installed in our virtual environment, for instance via Python’s package manager (pip):

pip install pandas
pip install numpy
pip install plotly
pip install dash
pip install dash_bootstrap_components

We will also use the datetime module, but it is part of Python’s standard library, so there is nothing to install. Also, once we are ready to deploy our app online, you will need to set up Heroku, Postgres, etc., but we will do that when the time comes, so don’t worry about it for now.
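A quick way to confirm that everything is in place is to ask Python which of the modules it can find. This is a small sketch using only the standard library; the helper name is made up for the example.

```python
from importlib import util

# Import names of the packages installed above, plus the built-in datetime.
REQUIRED = [
    "pandas",
    "numpy",
    "plotly",
    "dash",
    "dash_bootstrap_components",
    "datetime",
]


def missing_modules(names):
    """Return the subset of names that cannot be found in the current environment."""
    return [name for name in names if util.find_spec(name) is None]


missing = missing_modules(REQUIRED)
if missing:
    print("Still missing:", ", ".join(missing))
else:
    print("All required packages are available.")
```

Run it with the virtual environment activated, otherwise it will report on your system-wide Python instead.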

You may also want to download my repository so you can play around with it directly. You can do it either via git if you know how to use it, or by simply downloading all the files from the link above to a local folder.

Ok, now you should have all the packages installed on your system and the correct folder structure for the app. In the next post, we will start constructing the basic components of the app itself!

Thanks for reading :)


Paolo Molignini, PhD

Researcher in theoretical quantum physics at Stockholm University with a passion for programming, data science, and probability & statistics.