Visualize Inflation World Data on Plotly Interactive Map — Colorscale Solutions for Outliers — Beginner-Friendly Guide

Sabina L
6 min read · Jul 17, 2022


If you want to make an interactive plot of country-level data, you are in the right place. We can create a plot for a single snapshot in time, or we can add a slider and show how the data changes from year to year. The plot we'll make by the end of this tutorial lets you press the play button and toggle data bins in the legend on the right to visualize annual inflation changes worldwide.

Anyone with basic Python skills can follow along. It's a beginner-friendly guide with some bonus material at the end.

In this guide we will use two data sources and will cover 5 sections:

  1. Plotly installation and other required packages
  2. Get & preprocess data: two example datasets
  3. Example 1: Point-in-time snapshot Choropleth (map) plot
  4. Example 2: Interactive map by year — 3 versions
  5. Bonus: save, export, publish your plots

For beginners, package installation can be the biggest hurdle, so we devote some time to it in this guide.

1. Plotly installation and other required packages

Open a terminal window. If you use Anaconda (or any variant of it) for environment management, run the first line below. Otherwise, install with pip using the second line.

conda install -c plotly plotly=5.9.0
pip install plotly==5.9.0

See the official Plotly documentation for installation instructions and the package reference.

Additionally, we will be using pandas and numpy. For the bonus part, we will use the datapane package. You can set it up, link it to your GitHub, and use it to share plots on Medium by following this straightforward guide.

❗️Run the code from a Jupyter notebook to avoid issues with displaying your interactive plotly charts.

Package imports:

2. Get & preprocess data: two example datasets

In both examples, countries are already labeled by their full names, so we do not need to relabel anything. We do need to read in the data properly and add some extra code that lets us create choropleth charts with ease.

For our first example, we want to see the latest inflation prints worldwide, indexed by full country name. These can be found on Trading Economics. If you do not have a subscription, you can simply copy the data into an Excel file.

For the second example, we want to see how inflation changed from year to year around the world. We will use the annual inflation data published by the World Bank, which is currently available for most countries up to 2021. Note that you need to download and unzip a csv file.

Your data frame should have the following structure:
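The structure itself is not shown here, so as an illustration: the World Bank file is wide (one column per year), while an animated map needs a long table with one row per country and year. Below is a minimal sketch of that reshape. It uses a tiny in-memory stand-in with placeholder values, and the column name 'Inflation Rate %' is an assumption. The real download typically has a few metadata lines above the header, which `pd.read_csv` can skip with its `skiprows` argument.

```python
import io

import pandas as pd

# Tiny stand-in for the World Bank csv. Values are placeholders, not real statistics.
raw = io.StringIO(
    "Country Name,Country Code,2019,2020,2021\n"
    "Argentina,ARG,50.0,40.0,48.0\n"
    "Japan,JPN,0.5,,-0.2\n"
)
wide = pd.read_csv(raw)

# Reshape from one column per year to one row per (country, year) pair.
long_df = wide.melt(
    id_vars=["Country Name", "Country Code"],
    var_name="Year",
    value_name="Inflation Rate %",  # assumed column name
)
long_df["Year"] = long_df["Year"].astype(int)
long_df = long_df.sort_values(["Year", "Country Name"]).reset_index(drop=True)
print(long_df)
```

Missing years stay as NaN after the reshape, which matters later when choosing a color for countries without data.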

3. Example 1: Point-in-time snapshot Choropleth (map) plot

Here is the code that created our first example. Note that we want to plot the 'Last' column of the data frame, so we assign it to the z parameter. We also use locationmode = 'country names', but if your data identifies countries in a different format, a few more built-in options are available.

Colorscales can be customized through the colorscale parameter. The available colorscales are listed in the Plotly documentation.

We can also show additional text when hovering over a country by passing selected columns to the text parameter. Note that the country names specified in locations are displayed by default.

If you wish to display the world map using another projection, check the list of available projections and replace 'equirectangular' with the one you want in the projection setting inside the geo dictionary.

4. Example 2: Interactive map by year

This is more challenging than the first example, especially if your data contains outliers, as the inflation dataset does. As discussed, we will use the World Bank inflation dataset here to get data year by year. We will create 3 versions of this plot to address data issues that we could encounter:

a) Basic inflation interactive annual chart without special treatment of outliers

b) Use choropleth on log-transformed inflation (addressing outliers)

c) Use custom bins (addressing outliers)

Let’s start with:

a) Basic inflation interactive annual chart without special treatment of outliers

First, if your data doesn’t have extreme outliers, it’s easy to generate the choropleth chart animated by year.

The most important trick in the code is to set a custom range_color that captures the minimum and maximum of the observations across the entire time period. This way, the colorscale stays the same regardless of the range of values in any single year, which makes the colors comparable across time.

The problem with applying a standard colorscale to data that includes several extreme outliers is that we can't see much difference in color between the small values. In our case, the inflation dataset consists mostly of comparatively small values, below 50%, plus a few extremes above 500%, which stretch the colorscale too much. While we can try a different colorscale (like magma), it doesn't actually solve the problem for our dataset.

b) Use choropleth on log-transformed inflation (addressing outliers)

One way to solve this issue is to transform the variable, for example with a log transform. Let's compare the histograms of inflation observations with and without the log transform. The extremes, marked in red, are hard to see, but they are present in the original inflation data, which also has a massive concentration of values around 0.

The log-transformed data doesn't have this problem: the distribution of values is much closer to normal and is not overly concentrated around a few data points. When we apply a continuous colorscale to the log-transformed inflation, the color differences are therefore much easier to see than with the original inflation data.

Using log-transformed inflation gives us a more readable map:

Here, the important part is to add the original, non-transformed inflation rate to the hover text using the hover_data parameter and to set color to the 'Log Inflation Rate %' column name.
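The exact transform is not shown in this text. A plain `np.log` is undefined for zero and negative inflation, so the sketch below swaps in a signed log, sign(x) * log(1 + |x|), as one possible option. That choice and the sample values are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Placeholder inflation values with one extreme outlier (not real statistics).
inflation = pd.Series([-1.0, 0.5, 2.0, 3.0, 8.0, 40.0, 550.0])

# Signed log transform: keeps the sign, compresses large magnitudes.
# This is an assumed stand-in, not necessarily the transform behind the original plot.
log_inflation = np.sign(inflation) * np.log1p(inflation.abs())

# Skewness drops noticeably after the transform.
print("skew before:", inflation.skew())
print("skew after: ", log_inflation.skew())
```

The transform is monotonic, so the ranking of countries is preserved while the outlier no longer dominates the color range.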

c) Use custom bins (addressing outliers)

Another way to address outliers is to group observations into several bins based on range and assign a discrete colorscale. Note that you can de-select the bins you do not wish to examine on the map by toggling the respective bins on the legend. This way you can track the progress of inflation only in the bins of your choice. We have clearly upped the contrast in our chart, making it easier to see the differences across the globe.

We need to work a bit harder on this one. First, we create custom bin endpoints. Next, we assign a bin to each row of the dataframe. Then we create a separate column with the bin order. This is important because we need to pass a sorted dataframe to avoid issues with the order of the legend entries.

We also create a discrete color scheme of RGB colors, with white assigned to nan values, which we pass to color_discrete_map.

Other notable parameters include animation_group, which needs the 'bins' column converted to string type. You can also update the title of the legend using the update_layout method.
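These steps can be sketched with `pd.cut`. The bin endpoints, the labels and the 'bin_order' column name below are hypothetical, and the values are placeholders. Only the 'bins' column name comes from this guide.

```python
import numpy as np
import pandas as pd

# Placeholder data (not real statistics); one country has no observation.
df = pd.DataFrame({
    "Country Name": ["Argentina", "Japan", "Turkey", "Sudan"],
    "Year": [2021, 2021, 2021, 2021],
    "Inflation Rate %": [48.0, -0.2, 19.0, np.nan],
})

# Step 1: hypothetical bin endpoints and matching labels (left-closed intervals).
edges = [-np.inf, 0, 2, 5, 10, 25, 50, np.inf]
labels = ["< 0", "0-2", "2-5", "5-10", "10-25", "25-50", "50+"]

# Step 2: assign a bin to each row; missing values get their own "nan" bin.
df["bins"] = pd.cut(df["Inflation Rate %"], bins=edges, labels=labels, right=False)
df["bins"] = df["bins"].cat.add_categories("nan").fillna("nan")

# Step 3: keep the bin order as a number and sort by it.
df["bin_order"] = df["bins"].cat.codes
df = df.sort_values(["Year", "bin_order"]).reset_index(drop=True)

# Plain strings are easier to use as discrete color keys later.
df["bins"] = df["bins"].astype(str)
print(df)
```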

5. Bonus: save, export, publish your plots

You can download your plot as a static image in formats such as png, pdf, or jpeg. To avoid a low-resolution image, you can specify the height and width.

To keep the option to interact with the plot, you can save it as HTML. See the Plotly export guide for more options.

Finally, if you wish to publish your plot to Medium or another platform, I highly recommend setting up a datapane account and linking it to your GitHub. It will take you less than 10 minutes if you follow the instructions from this article.

Thank you for reading! Feel free to reach out with questions and suggestions via the comments section.
