Geospatial Data Visualization

How to Map Data on the County Level in Tableau

No longitude/latitude data in your dataset? No problem.

Jamel Dargan
Analytics Vidhya


An OpenData map of localities in Virginia, showing Richmond City surrounded on all sides.
Map of localities in Virginia, showing Richmond City surrounded on all sides. All map images generated in Tableau Public (©Mapbox ©OpenStreetMap). All screen-captures edited by the author.

Overview

Counties and county equivalents are often oddly shaped and meandering. A smaller county may be nearly, if not completely, surrounded by a larger, separate jurisdiction. Using a single central pair of longitude and latitude coordinates to describe the location of the larger county might instead locate a point in the middle of the smaller one. Fortunately, there are ways to get around this problem.

In this article, we will explore the following:

  • Identifying states and counties with Federal Information Processing Standards (FIPS) codes
  • Mapping a dataset without geographic coordinates
  • Animating geodata maps in Tableau Public

Motivation

If you are only looking for instructions on accessible ways to visualize worldwide or country-level geodata, you are in luck. There is no shortage of relevant software and library documentation and tutorials. Information can be a bit more sparse if you are looking into ways to visualize data on a county level.

In this example, we are working with data for counties and independent cities in Virginia that does not include longitude and latitude coordinates. In notebooks at the related GitHub repository, we explore how cases of Coronavirus, consequent hospitalizations, and related deaths in Virginia’s Hampton Roads region compare to those reported in other areas of the state, particularly in the state’s capital city of Richmond. We relied on interactive plotting in Python with Plotly Express to visualize data for multiple localities (including population data) on a single figure, with the option to hover or drill down for greater detail.

Animated plots used in our previous notebooks enable us to quickly make visual comparisons across multiple localities, over time. Bar-plots and scatter-plots we include clearly show the Fairfax (county) area as having been more severely impacted than other localities. However, the plots do not easily reference some factors likely influencing the spread of the virus. They do not show us that Fairfax borders Washington, D.C., or that Virginia Beach (in Hampton Roads) is a regional tourist destination. This type of information might be better communicated, at a glance, by incorporating relevant map images into our visualizations. That is what we will do in this current exploration.

B.L.U.F.

Five rows from a subset of our data, captured from a Pandas dataframe of 6 columns and 544 rows.
A subset of data captured from a Pandas dataframe.

We will reuse the primary dataset from our previous effort and stay within the same timeframe, for the sake of consistency. We also want to maintain a level of interactive publishing ability comparable to what we previously attained.

In our dataset, counties are identified by name and by FIPS code. A three-digit FIPS code represents a county or county-equivalent within a particular state. In Virginia, for example, the code “171” represents Shenandoah. Since the same code represents different counties in 13 other states, five-digit FIPS codes may be created by prepending a two-digit state code. The state code for Virginia is “51”, so “51171” is the five-digit FIPS code for the county of Shenandoah.
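The prepending step is easy to get subtly wrong when codes are stored as numbers, so here is a minimal Python sketch of it. The helper name `county_fips` is our own, introduced only for illustration; it zero-pads both parts so that integer and string inputs produce the same five-character result.

```python
def county_fips(state_code, county_code):
    """Combine a state FIPS code and a county FIPS code into a five-digit string."""
    # Zero-pad each part so integer inputs (51, 171) and string inputs agree.
    state = str(state_code).zfill(2)
    county = str(county_code).zfill(3)
    if len(state) != 2 or len(county) != 3:
        raise ValueError("expected a 2-digit state code and a 3-digit county code")
    return state + county


# Virginia is "51" and Shenandoah is "171", as described above.
shenandoah = county_fips(51, 171)
print(shenandoah)  # 51171
```

Returning a string rather than an integer keeps any leading zeros intact, which matters for states whose codes begin with zero.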

When we worked with our data in Python, we used the Plotly Express (PX) library. To add geodata to our visual analysis, it would be reasonable to start where we left off. Unfortunately, while Plotly previously supported generating choropleth maps directly from FIPS code data, via its “figure factory” module, that approach has since been deprecated. Current methods include a GeoJSON-based approach and an alternative Mapbox tile-based approach. Both are solid options; however, either approach requires centering maps on geographic coordinates. We could turn to other Python libraries, such as Folium or Geopandas, but they also require geographic coordinates and/or shapefiles.

Spoiler Alert

To simplify this exercise, we will step away from Python and the Pandas library. Instead, we will visualize our data in the free analytics platform, Tableau Public. In Tableau, we can select a “geographic role” for a column’s data. Using data already built into its map server, Tableau will assign longitude/latitude coordinates.

For a column including state names, you can select the “State/Province” Geographic role to enable Tableau to visualize the data on a map. Similarly, we can assign the “County” geographic role to our Locality field in order to map the data. In the same way, Tableau accepts FIPS codes as valid data for county-level geography.

Let’s take a look at how we can take advantage of this functionality.

Loading Data

Having originally obtained our data in CSV format from the Virginia Open Data Portal, we will use a local version of the data file. Our downloaded subset spans the period from March 17 to July 31, 2020, and descriptions of dataset columns are available on the data portal website.

Preview of the dataset as a table in Tableau.
Preview of data table connected in Tableau.

In Tableau (in this case, the 64-bit Tableau Desktop Public Edition, on Windows 10), we connect to our data file as a text file. As we load the data, we immediately can view the columns, data-types, and values in our dataset. We find Total Cases, Hospitalizations, and Deaths columns with daily values for each Locality sorted by Report Date. The locality indicates the name of the county or independent city represented in each record.

As expected, there is also a column indicating the FIPS (Fips) for each locality. In addition, we have a column indicating the VDH Health District for each locality which we will not need for this particular exploration.
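For readers who want to sanity-check the file outside Tableau, the sketch below parses a few rows with Python's standard `csv` module and keeps only the reporting period we use. The header spellings, the MM/DD/YYYY date format, and every value in `SAMPLE_CSV` are assumptions made up for illustration; check them against the real download before relying on this.

```python
import csv
import io
from datetime import date, datetime

# Made-up rows standing in for the downloaded file. Header names, the date
# format, the district label, and all counts are illustrative assumptions.
SAMPLE_CSV = """Report Date,FIPS,Locality,VDH Health District,Total Cases,Hospitalizations,Deaths
03/16/2020,51059,Fairfax,Example District,10,1,0
03/17/2020,51059,Fairfax,Example District,12,1,0
07/31/2020,51760,Richmond City,Example District,100,5,1
08/01/2020,51760,Richmond City,Example District,101,5,1
"""

START, END = date(2020, 3, 17), date(2020, 7, 31)


def load_rows(text):
    """Parse CSV text and keep rows whose report date falls within START..END."""
    kept = []
    for row in csv.DictReader(io.StringIO(text)):
        reported = datetime.strptime(row["Report Date"], "%m/%d/%Y").date()
        if START <= reported <= END:
            row["Report Date"] = reported
            row["Total Cases"] = int(row["Total Cases"])
            kept.append(row)
    return kept


rows = load_rows(SAMPLE_CSV)
print(len(rows))  # 2
```

Note that `csv.DictReader` leaves the FIPS column as text, which is the form we want for mapping.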

A Light Scrub

A context menu for `Fips` shows a dot next to the current “Number” format, with the desired “String” format highlighted.
Changing a data type.

The first issue we will address is the Fips data-type, which loads as an integer. We can expand a contextual menu by selecting the symbol located above the column name and choose “String” as the field’s type. Since our data is limited to counties in Virginia, we will also add a State column to our dataset by opening the dropdown menu from the top-right of one of our column names.
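As a general illustration of why identifier codes are safer as strings than as integers, the short Python sketch below shows a leading zero being lost and then restored. The example code "01001" comes from a state whose code starts with zero; the article does not say whether our Virginia file needs any such padding, so treat this as background rather than a required step.

```python
raw = "01001"                        # a five-digit code whose state part is "01"
as_number = int(raw)                 # 1001: the leading zero is dropped
as_text = str(as_number)             # "1001", now only four characters
restored = str(as_number).zfill(5)   # "01001", padded back to five characters

print(as_number, as_text, restored)
```

Codes are labels, not quantities, so nothing is lost by giving up arithmetic on them.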

A dialog for adding a new column to the dataset includes a field for the column name above an area to define the field content.
Tableau provides a dialog for adding fields to a dataset.

From this menu, we select “Create Calculated Field…” to open a dialog in which we may enter “State” as our desired field name. Below the field name, we can enter a calculation to be performed along the column. In our case, we only want the value “Virginia” for each row of the dataset.
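For comparison with our earlier Python workflow, a constant-valued field like this amounts to adding the same key to every record. The sketch below uses a hypothetical helper, `add_constant_field`, and two example records; it is an analogy for what the calculated field produces, not a description of how Tableau implements it.

```python
def add_constant_field(rows, name, value):
    """Return copies of the rows with one extra field set to the same value."""
    return [{**row, name: value} for row in rows]


# Two example records with locality names and five-digit FIPS codes.
rows = [
    {"Locality": "Shenandoah", "Fips": "51171"},
    {"Locality": "Accomack", "Fips": "51001"},
]

with_state = add_constant_field(rows, "State", "Virginia")
print(with_state[0])
```

The original rows are left untouched, much as a calculated field in Tableau does not alter the underlying data file.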

Next, we select Sheet 1 and rename it. We can see Tableau has automatically made some assumptions, dividing our data table into (qualitative) dimensions and (quantitative) measures. Note that these assumptions may not necessarily suit your precise needs.

Image showing an open context menu for the `Fips` field and a slide-out submenu for selecting its geographic role.
Selecting a geographic role for the Federal Information Processing Standards code.

In our case, Tableau assumes that Fips is a measure to be counted. We need to set the field’s geographic role, so Tableau will correctly recognize the field as a dimension and a unique identifier for a county.

Image showing an open context menu for the `State` field and a slide-out submenu for selecting its geographic role.
Selecting the geographic role for the state dimension.

We follow similar steps to assign the proper geographic role for our State dimension. Note how Tableau has changed the icon associated with Fips from alpha-characters (when the field was identified as a text measure) to a globe symbol, now that we have assigned its role as a geographic dimension.

Visualizing Dimensions

Image of the Tableau workspace, showing a map of Virginia and surrounding states.
The Tableau Workspace.

When we drag an appropriately named geographic dimension onto the Marks card’s “Detail” property, Tableau recognizes the data and generates relevant longitude/latitude values as columns and rows to locate the geodata on a map background. In the image above, we see a blue point that indicates the intersection of the geographic coordinates.

An image of Virginia, filled-in with the color blue, with Tableau’s “Marks card” sidebar visible.
Color-filled map of the Commonwealth of Virginia.

When we change the mark type from automatic to map, Tableau fills in color for the states in our dataset. Our data is limited to the state of Virginia. Of course, we are interested in county-level data.

Virginia, in blue on a large map with county outlines. An open tooltip displays State and Fips data for a highlighted county.
Virginia with county outlines.

With the geographic role for Fips set to “County,” we can drag the dimension onto the Marks card’s “Detail” property to add county borders to the state map. We see that dimensions added to the detail property show in the tooltip for areas beneath the cursor.

Note: “Locality” is not a recognized geographic role in Tableau. We will need to set the Locality dimension’s geographic role to “County” for Tableau to recognize the field values as geodata.

Detail of the Virginia map, labeled with its locality names.
Locality names added as labels for mapped areas.

We add names to our mapped localities by dragging the Locality dimension onto the “Label” property. In addition, the locality names will be added automatically to the tooltip that appears as we pass our cursor over the map.

Visualizing Measures

Detail of the localities map, each color-coded for their count of total cases (light-greenish to dark-blue as they increase).
Detail of Virginia localities map, color-coded for their count of total cases.

Now we will visualize Total Cases data on our map. We drag the measure onto the Marks card’s “Color” property, and Tableau colors each locality based on the sum of its cases over the period of observation. A legend is also generated, defining upper- and lower-bound values and color associations. Here, the upper-bound is 1,106,116, which we can see — using the tooltip — is the case-count for the county of Fairfax.
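The aggregation behind this coloring is a per-locality sum, with the legend bounds taken from the smallest and largest totals. A minimal Python sketch of that idea follows; the records and their counts are made up for illustration and are not values from our dataset.

```python
from collections import defaultdict

# (locality, total cases reported on one day); all counts are invented.
records = [
    ("Fairfax", 12),
    ("Fairfax", 14),
    ("Fairfax City", 0),
    ("Fairfax City", 1),
]


def sum_by_locality(records):
    """Add up the reported values for each locality across all rows."""
    totals = defaultdict(int)
    for locality, cases in records:
        totals[locality] += cases
    return dict(totals)


totals = sum_by_locality(records)
lower_bound = min(totals.values())
upper_bound = max(totals.values())
print(totals, lower_bound, upper_bound)
```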

A close-up map image including the independent city of Fairfax, in the middle of Fairfax County, Virginia.
Close-up: the independent city of Fairfax, in the middle of Fairfax County, Virginia.

In contrast, Fairfax City’s total case-count is below 6,000. Had we tried to use a single, central pair of geographic coordinates to define the county of Fairfax, we might have identified a point within Fairfax City (where the sum of total cases is 185-times lower).
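To make the enclave problem concrete, here is a small Python sketch using invented square shapes rather than real boundaries: a "county" whose outline is a 10-by-10 square, with a 2-by-2 "city" cut out of its middle. The function names are our own. The centroid of the county outline lands inside the city, so a single central point would be attributed to the wrong jurisdiction.

```python
def ring_centroid(ring):
    """Centroid of a simple polygon outline, using the shoelace formula."""
    area2 = cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    return cx / (3 * area2), cy / (3 * area2)


def point_in_ring(point, ring):
    """Ray-casting test; points exactly on the boundary are not handled specially."""
    x, y = point
    inside = False
    for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
        if (y0 > y) != (y1 > y):
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical shapes: the city is an enclave (a hole) in the county.
county_outline = [(0, 0), (10, 0), (10, 10), (0, 10)]
city_outline = [(4, 4), (6, 4), (6, 6), (4, 6)]

center = ring_centroid(county_outline)
in_city = point_in_ring(center, city_outline)
in_county = point_in_ring(center, county_outline) and not in_city
print(center, in_city, in_county)  # (5.0, 5.0) True False
```

Mapping by filled boundary shapes, as Tableau does once a geographic role is assigned, avoids this ambiguity.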

Map Animation

Now that we have an idea of how we can visualize our geographic data on a map, we can explore our data over distinct periods of time. Ideally, we will be able to animate our map to view the changes in Total Cases.

Open menu view for the Report Date dimension, on Tableau’s Pages card, highlighted to change the view from Year to Day.
Changing the view with Tableau’s Pages card.

We drag the Report Date dimension onto the Pages card and indicate the period of time by which we wish to segment our data views. Our data is reported daily. We will filter by “Day”, to create a page-view for each date.
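Conceptually, paging by day means splitting the records into one frame per report date and stepping through the frames in date order. The Python sketch below illustrates that grouping with invented records and a hypothetical `pages_by_day` helper; it is an analogy for the Pages card, not a description of Tableau internals.

```python
from collections import defaultdict
from datetime import date

# (report date, locality, total cases); all counts are invented.
records = [
    (date(2020, 3, 18), "Fairfax", 14),
    (date(2020, 3, 17), "Fairfax", 12),
    (date(2020, 3, 17), "Richmond City", 1),
    (date(2020, 3, 18), "Richmond City", 2),
]


def pages_by_day(records):
    """Group records into one {locality: cases} frame per date, in date order."""
    pages = defaultdict(dict)
    for day, locality, cases in records:
        pages[day][locality] = cases
    return [(day, pages[day]) for day in sorted(pages)]


for day, frame in pages_by_day(records):
    print(day.isoformat(), frame)
```

Each printed line corresponds to one page-view: the map as it would be colored for that single date.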

Pages card encircled (by the author) in red on the workspace sidebar, beneath the map legend.
Highlight by the author.

A new card appears on our sidebar, where we can select a specific date to visualize on our map. We also can change the map’s color-theme, from the legend’s dropdown menu.

Showing localities mapped from green to red, by the sum of cases for each.
Changing colors for our visualization.

Since our dataset comprises such a large number of cases in Fairfax County, colors for our localities of interest (Richmond City and those in the south-eastern, Hampton Roads region) do not appear too different from areas that were not hotspots.

Dialog listing localities. Selected counties will be excluded, determined by a check-mark in the control’s “Exclude” box.
Excluding localities with Tableau’s Filter dialog.

We can add the Locality dimension to the Filters card, to highlight only our areas of interest. Alternatively, we can adjust the center value for our legend's color scale.

The “Edit Colors” control includes options for start and end colors and values, a center value, and color-range reversal.
The Edit Colors dialog allows you to select a palette and a range of values for the map.

Selecting the “Edit Colors” option from Tableau’s legend menu, we will choose a center value of 500, to ensure that we move from green to gold as cases reach 500 for any locality. We also set the legend to use the full range of colors in our palette, segmented into ten steps. You can easily experiment with these settings, to see how they reflect changes in the data over time.
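One way to reason about a stepped color scale with a custom center is sketched below in Python. This is our own simplified model, not Tableau's documented algorithm: values below the center are spread over the lower half of the palette, values above it over the upper half, and the result is quantized into ten steps. The `low` and `high` bounds in the example are hypothetical, and the function assumes `low < center < high`.

```python
def color_step(value, low, center, high, steps=10):
    """Map a value to a palette step index (0 .. steps - 1) around a center value."""
    # Clamp to the legend range so out-of-range values use the end colors.
    value = min(max(value, low), high)
    if value <= center:
        position = 0.5 * (value - low) / (center - low)
    else:
        position = 0.5 + 0.5 * (value - center) / (high - center)
    return min(int(position * steps), steps - 1)


# Hypothetical legend: 0 to 10,000 cases, centered on 500.
for cases in (0, 250, 500, 5250, 10000):
    print(cases, color_step(cases, low=0, center=500, high=10000))
```

With a center of 500, half of the palette is devoted to localities below 500 cases, which is why smaller localities become easier to tell apart.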

The page-views animation control, including: a page selector, 3 speed options, a manual slider, start, play, and reverse.
Animation control for page views.

You can drag the slider to preview changes. To animate the map, simply select the right-facing triangle below the slider. The animation control includes additional options for adjusting animation speed and direction.

The workspace, map options menu open and showing Hampton Roads localities highlighted over a satellite map background.
Hampton Roads localities highlighted over a satellite map background.

The Tableau Public menu bar contains additional options for adjusting the appearance of your visualization, such as projecting your geodata over a satellite image map.

Carry On

Using Tableau to visualize geodata is not limited to FIPS codes: it accepts several options, including GeoJSON-formatted data, shapefiles, and, of course, geographical coordinate pairs. For practicing Data Science, I am a fan of computational notebooks. However, presenting analysis sometimes calls for a more accessible alternative. Tableau Public lets you export your data views (and animations) to its public servers, where you can make your data visualizations freely available to practically anyone with an internet connection.
