Geospatial Visualization: A Comparison of the Many Available Options

Gianna Maniaci
10 min read · Mar 28, 2020


by Melissa Phillips & Gianna Maniaci

Jumping into new data visualization software can seem overwhelming at first, especially when working with geospatial data for the first time. Geospatial data links features in your dataset to locations on Earth, either by including longitude/latitude coordinates or by using an address or other reference point to locate each record. We looked at ArcGIS, Tableau, Folium, GeoPandas, and Bokeh and wrote a short summary of each tool to guide users making geospatial visualizations for the first time.

The data used in this article is the Tree Inventory Point dataset found on the Charlottesville Open Data website. It was collected to record all trees on public property in the city of Charlottesville during 2008. The project was initiated by Charlottesville Parks & Recreation and records the coordinates, as well as several other attributes, of individual trees. If you would like to access this data yourself, feel free to follow the link below.

ArcGIS

ArcGIS provides a platform for working specifically with geospatial data. It is usually available by subscription, but some institutions hold Enterprise accounts that may give their members free access.

Maps in ArcGIS consist of a base map on which layers incorporating the data are added. To begin, the dataset needs to be added as a ‘hosted layer’ because it is too big to add directly to the map. From the home page, go to the ‘Content’ tab and choose ‘Add item from computer’. Upload the .csv (you have to give it a tag) and then open it in the map viewer.

This is the initial result. Notice the detailed terrain on the base map and the automatic coloring of the points by species, limited to the top 10 species. There are many base maps to choose from in ArcGIS.

From here, there are plenty of options for viewing and analyzing the data. Here is a print-view example with colored and differently-sized markers distinguishing species and canopy size, and a base layer showing household income.

The map can also easily be published on a website or in an app, and it is located online here. ArcGIS also makes the map highly interactive with minimal effort and provides many data layers to augment the dataset.

Tableau

Tableau is a data visualization software package available by monthly subscription, although there are some ways to use it for free, for example if you are a student or are working with public data. It is known for being easy to use and for producing publishable visualizations.

One of the many options Tableau provides for visualizing data is maps, which are of course only useful when the data is geospatial. The process is not difficult, though one essential step is needed to prepare the geospatial data. You begin by opening a new workbook and connecting to your .csv file as the data source. The Tree Inventory dataset was imported with x and y columns, which represent the longitude and latitude respectively. Before you can use this data, the data type must be changed so that Tableau recognizes it as geospatial. To do this, right-click on the x and y columns and use the drop-down menu to give each a ‘geographic role’ of either longitude or latitude. It is important to check back with the original .csv file to make sure you defined these columns correctly, or the points could end up in the wrong location. Once this is done, the data is ready to be used.
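The check against the original .csv can also be done with a few lines of Python before opening Tableau. The sketch below is only an illustration: the bounding box around Charlottesville, the sample rows, and the helper name `roles_look_right` are all assumptions, not part of the dataset or of Tableau.

```python
import csv
import io

# Rough bounding box around Charlottesville, VA (an assumption for illustration).
LON_RANGE = (-78.6, -78.4)
LAT_RANGE = (37.9, 38.2)

# Made-up sample rows standing in for the real .csv export.
SAMPLE_CSV = """x,y,species
-78.4767,38.0293,Oak
-78.4901,38.0412,Maple
-78.5032,38.0335,Elm
"""


def roles_look_right(rows, lon_col="x", lat_col="y"):
    """Return True if every row has lon_col inside LON_RANGE and lat_col inside LAT_RANGE."""
    for row in rows:
        lon, lat = float(row[lon_col]), float(row[lat_col])
        if not (LON_RANGE[0] <= lon <= LON_RANGE[1]):
            return False
        if not (LAT_RANGE[0] <= lat <= LAT_RANGE[1]):
            return False
    return True


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
print(roles_look_right(rows))                            # x as longitude, y as latitude: True
print(roles_look_right(rows, lon_col="y", lat_col="x"))  # roles swapped: False
```

A simple range check of -180 to 180 and -90 to 90 would not catch a swap here, because both Charlottesville coordinates are valid latitudes, which is why the sketch compares against a local bounding box instead.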

In the first sheet, rows were defined as latitude and columns as longitude, which produced the following map of basic tree locations:

From here, there are plenty of opportunities to incorporate more layers of data. We could use color to differentiate between species, size of points to reflect canopy size, and perhaps even change the base map to incorporate demographic details such as household income. The map below shows all of these changes:

Tableau provides the opportunity to explore the data in many ways by simply dragging and dropping feature headings into the active worksheet. The possibilities are endless!

Folium

Folium is a Python package created to help users build geospatial visualizations. It is easy to use as long as the user has the longitude and latitude coordinates of each data point. Every Folium map is interactive, meaning the user can zoom in or out and pan around as they please.

The Tree Inventory Point dataset was easy to use in Folium because it already contained coordinates, which made it simple to plot on the map. However, the approach we used adds each point to the map individually inside a for-loop, and this caused JupyterHub to lag significantly when plotting the tree inventory dataset, which has about 8,000 points. It was almost impossible even to capture a picture of all of the points for this article. The best workaround was to create multiple for-loops and plot different pieces of the dataset one at a time. This allowed a picture to be taken before JupyterHub lagged too badly, but honestly, this was too much data for Folium to handle this way. The map below shows the initial image created when plotting all the points at once:

This obviously does not provide much insight into the data, and unfortunately it was impossible to use Folium’s interactive zoom to get a better picture because the for-loops caused too much lag on JupyterHub.

Next, it was important to test different data augmentation operations. Folium can add multiple layers to a map through the use of a GeoJSON file. Because we did not have a GeoJSON file of relevant Charlottesville data, an example was not added to this article, but it is a useful feature of Folium.

The map below is more complex and shows the agency that installed each tree. Adding a legend to a Folium map does not seem to be very easy: a normal legend can be added with HTML, or comes easily with a choropleth map, but otherwise the best substitute is the popup feature. Because the map is interactive, it is easy to add popups that provide more information about the data.

Folium also has many other data augmentation features, such as layering, which can be done with GeoJSON files. There are plenty of options when using Folium; with a little exploration you will be an expert in no time.

GeoPandas

GeoPandas is an open source project created to work with geospatial data in Python. It extends pandas, the existing Python library already used extensively for data analysis and visualization, by allowing spatial operations on geometric types. The goal of GeoPandas is to let the user create maps in Python that would otherwise have required a spatial database.

Unfortunately, since GeoPandas is not a spatial database, it requires a little extra work on the user’s part. To actually create the map, you must use a GeoDataFrame, which is a pandas.DataFrame that has a geometry column. You first plot this geometry to get the base map and then plot your actual points on top to visualize your data. The only maps that you can access without supplying your own shapefile are the world map and the New York boroughs map.

The link below shows where the data for the base map was obtained:

The map below shows the City of Charlottesville in gray with every tree plotted on top as a blue star.
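A map along these lines could be sketched as follows. This is an illustration under stated assumptions rather than the code behind the image: the boundary polygon is a made-up rectangle standing in for the real city shapefile (which would normally be loaded with `gpd.read_file`), and the tree rows and column names are invented.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import Polygon

# Made-up stand-in for the city boundary; a real map would load a shapefile
# with gpd.read_file(...) instead of building a rectangle by hand.
boundary = gpd.GeoDataFrame(
    {"name": ["Example boundary"]},
    geometry=[Polygon([(-78.53, 38.00), (-78.44, 38.00),
                       (-78.44, 38.08), (-78.53, 38.08)])],
    crs="EPSG:4326",
)

# Made-up tree rows; "x" is longitude and "y" is latitude, as in the article.
trees_df = pd.DataFrame({
    "x": [-78.4767, -78.4901, -78.5032],
    "y": [38.0293, 38.0412, 38.0335],
    "species": ["Oak", "Maple", "Elm"],
})

# Turn the plain DataFrame into a GeoDataFrame by building a geometry column.
trees = gpd.GeoDataFrame(
    trees_df,
    geometry=gpd.points_from_xy(trees_df["x"], trees_df["y"]),
    crs="EPSG:4326",
)


def plot_trees(boundary, trees):
    """Draw the boundary in gray, then the trees on the same axes as blue stars."""
    ax = boundary.plot(color="lightgray", edgecolor="black")
    trees.plot(ax=ax, marker="*", color="blue", markersize=40)
    return ax
```

Calling `plot_trees(boundary, trees)` draws the static figure with matplotlib. Passing the first plot's axes into the second call is what layers the points over the base map.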

You can of course adjust the visualization to show different attributes of the data, but the result is fairly plain. It is not interactive and relies heavily on having a descriptive base map, which can be difficult to find. GeoPandas offers a way to create simple maps, but does not provide much detail in each visualization. Getting started was a little confusing, and it took a while to find the necessary base map.

Bokeh

Bokeh is a software library that allows users to create interactive web visualizations. It can be installed on a computer and accessed with Python. Bokeh has many tools for displaying data, and the toolset for mapping geospatial data is only a small subset of these. As such, there is still work being done to make it more user-friendly and relevant.

To create a map, one first has to import the plotting tools from Bokeh and tiles from a tile vendor. Tiles form the base of the visualization, so in this example the tile ‘STAMEN_TERRAIN’ forms the base map. While most geospatial mapping software applies the Web Mercator projection to show spherical coordinates on a flat grid (i.e. mapping the round Earth to a flat, 2D version) without the user even realizing it, Bokeh requires the user to convert the longitude and latitude data into Web Mercator units in order to plot the locations on the tile. Setting the axes to ‘mercator’ translates the markings on the axes back to regular longitude/latitude coordinates. For this problem, we borrowed the following code from Colin Patrick Reid and adapted our data to match the input form:
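The borrowed snippet is not reproduced here. As a stand-in, the sketch below shows the standard Web Mercator formulas that such a conversion relies on; the function names are our own for illustration and the sample coordinate is an approximate point in Charlottesville.

```python
import math

# Radius of the sphere used by Web Mercator (EPSG:3857), in meters.
EARTH_RADIUS_M = 6378137.0


def lonlat_to_web_mercator(lon, lat):
    """Convert longitude/latitude in degrees to Web Mercator x/y in meters."""
    x = EARTH_RADIUS_M * math.radians(lon)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat) / 2))
    return x, y


def web_mercator_to_lonlat(x, y):
    """Inverse conversion, from Web Mercator meters back to degrees."""
    lon = math.degrees(x / EARTH_RADIUS_M)
    lat = math.degrees(2 * math.atan(math.exp(y / EARTH_RADIUS_M)) - math.pi / 2)
    return lon, lat


# Approximate coordinates for a point in Charlottesville (illustrative only).
x, y = lonlat_to_web_mercator(-78.4767, 38.0293)
print(x, y)
print(web_mercator_to_lonlat(x, y))
```

Applying `lonlat_to_web_mercator` to every row gives the x and y columns that can be plotted on top of the tile.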

After formatting the data correctly, the Bokeh map can be set up with the following code adapted from a Stack Overflow response:

Bokeh provides many formatting options for the output and tools to assist with navigation. Users can choose which ‘zoom’ and ‘pan’ options they would like, and whether they would like a ‘reset’ button. Here is a basic plot of the Tree Inventory data with the ‘pan’, ‘wheel_zoom’, ‘box_zoom’, and ‘reset’ navigation options:

When it comes to adding more depth to the visualization, like resizing the points or changing colors based on some characteristic of the data, there are many ways forward but they are not always simple. The numbers in the canopy column range from 1 to 3, so they can be used to resize the points, but they must be multiplied by a scaling factor to make the points visible. Changing colors based on species type is more difficult. Here, a ‘color’ column was added to the data and this was plotted with each point by including it in p.circle().
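The data preparation for this step can be sketched in plain Python. Everything specific here is an assumption for illustration: the column names `species` and `canopy`, the scaling factor of 5, the palette, and the helper name `add_marker_columns`.

```python
# Hypothetical values: a scaling factor and a small color palette.
SIZE_FACTOR = 5
PALETTE = ["#1b9e77", "#d95f02", "#7570b3", "#e7298a"]

# Made-up sample rows; the column names are assumptions.
trees = [
    {"species": "Oak", "canopy": 3},
    {"species": "Maple", "canopy": 1},
    {"species": "Oak", "canopy": 2},
]


def add_marker_columns(rows, size_factor=SIZE_FACTOR):
    """Return copies of rows with a 'size' and a 'color' value added to each."""
    # Assign one palette color per species, cycling if species outnumber colors.
    species_names = sorted({row["species"] for row in rows})
    color_for = {
        name: PALETTE[i % len(PALETTE)] for i, name in enumerate(species_names)
    }
    return [
        {
            **row,
            "size": row["canopy"] * size_factor,  # enlarge 1 to 3 into visible sizes
            "color": color_for[row["species"]],
        }
        for row in rows
    ]


styled = add_marker_columns(trees)
for row in styled:
    print(row)
```

The resulting size and color values can then be supplied alongside the coordinates when the points are drawn.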

Here is the more expressive map:

There are ways to add more features and build in a legend, but they would require more time and attention to detail.

Bokeh provides many methods to fully customize the final result, and the visualization produced is fully interactive and visually pleasing. Because of the vast number of options and the somewhat abstract code needed to generate results, the learning curve is steep. A data visualization specialist who is looking for a new tool to make an interactive visualization could do well with Bokeh, though a beginner who wants a quick result with minimal input may want to choose one of the more user-friendly software options.

We hope this information served as a helpful guide when choosing which geospatial visualization software would be best for each project. We wish you luck on your mapping endeavors. Have fun!
