Visualizing multi-dimensional data can pose unique challenges for a data professional. You want all your data included in the visualization, but you don’t want to overwhelm your audience.
What if geography is involved in your visualization? That poses even more of a challenge, because maps are often busy to begin with and showing exchanges between areas makes them harder still to read.
One of the more common visualizations for geographic many-to-many relationships is the flow map. It often shows, for example, the movement of goods between locations or traffic patterns. Henry Drury Harness is often credited with creating the first flow map in 1837, which demonstrated train usage in Ireland:
Another way to demonstrate flow is a graphical version of a crosstab query. You may know this visualization as a heat map or a matrix. There is a row and a column for each geographical point, and multi-hued color gradients show the relationship between two areas at their intersection point in the grid:
The matrix above shows the flow between the eight states and territories of Australia used in the research paper I’m reviewing.
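To make the crosstab idea concrete, here is a minimal Python sketch of how flow records could be turned into an origin-destination matrix. The state abbreviations, the counts, and the helper name `build_od_matrix` are all invented for illustration; none of them come from the paper or from the Maptrix software.

```python
# Toy flow records: (origin, destination, count).
# The numbers are invented for illustration and are not from the paper.
flows = [
    ("NSW", "VIC", 120),
    ("NSW", "QLD", 80),
    ("VIC", "NSW", 95),
    ("QLD", "NSW", 60),
    ("VIC", "QLD", 30),
    ("NSW", "VIC", 15),  # a repeated pair: counts are summed
]


def build_od_matrix(records):
    """Cross-tabulate flow records. Rows are origins, columns are destinations."""
    labels = sorted({r[0] for r in records} | {r[1] for r in records})
    index = {name: i for i, name in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for origin, dest, count in records:
        matrix[index[origin]][index[dest]] += count
    return labels, matrix


labels, matrix = build_od_matrix(flows)

# Print the grid as text; a heat map would color each cell by its value.
print("     " + " ".join(f"{name:>5}" for name in labels))
for name, row in zip(labels, matrix):
    print(f"{name:>5} " + " ".join(f"{value:>5}" for value in row))
```

Each cell holds the flow from the row's area to the column's area, so the grid is generally not symmetric. A heat map is this same grid with each number replaced by a color.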
There are some pretty clear pros and cons to both kinds of visualization. Flow maps can get incredibly busy, and therefore difficult to read, when there is a great deal of traffic or when the studied area is small. With matrices, one can very easily see relationships between the areas, but the geography of the area is lost entirely.
Researchers at Monash University conducted two experiments in 2017 with their own software, named Maptrix, which combines the two techniques of flow maps and matrices. The results of these two experiments, which compared the three methods, were rather interesting.
Through an online survey, the researchers asked participants to answer questions about the data presented in the three visualizations. The questions were task-based: for example, comparing the total flow and its magnitude between two areas, or determining whether a flow is occurring in a stated region.
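As a rough illustration of what those tasks ask of a reader, the sketch below answers the same kinds of questions programmatically. The data and the function names (`flow`, `total_flow_between`, `larger_direction`, `has_flow`) are hypothetical, and this is my reading of the task types rather than the researchers' actual survey logic.

```python
# Toy origin-destination counts, invented for illustration.
od = {
    "NSW": {"VIC": 135, "QLD": 80},
    "VIC": {"NSW": 95, "QLD": 30},
    "QLD": {"NSW": 60},
}


def flow(od, origin, dest):
    """Flow from origin to dest, or 0 if no such flow is recorded."""
    return od.get(origin, {}).get(dest, 0)


def total_flow_between(od, a, b):
    """Total flow between two areas, counting both directions."""
    return flow(od, a, b) + flow(od, b, a)


def larger_direction(od, a, b):
    """Return the (origin, dest) pair with the larger flow, or None on a tie."""
    forward, backward = flow(od, a, b), flow(od, b, a)
    if forward == backward:
        return None
    return (a, b) if forward > backward else (b, a)


def has_flow(od, origin, dest):
    """Whether any flow is recorded from origin to dest."""
    return flow(od, origin, dest) > 0


print(total_flow_between(od, "NSW", "VIC"))
print(larger_direction(od, "NSW", "VIC"))
print(has_flow(od, "QLD", "VIC"))
```

In a matrix, each of these answers is a lookup in one or two cells. On a flow map, the reader has to trace lines between regions, which gets harder as the map gets busier.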
In the first study, users tended to prefer Maptrix among the three options, and flow maps ranked at the bottom. The same trend held for answering questions about the data correctly. The countries studied were Germany, New Zealand, and Australia, all picked because they differ in size and in the number of states or regions they contain. One would think that it would be easier to answer questions about Australia, for example, seeing how that country is the least dense of the three and has only eight states (as opposed to Germany’s sixteen states over less land), but the researchers didn’t see any statistically significant differences in the data as it pertained to country shape.
For the second study, the researchers picked bigger countries (China and the United States), increased the size of the datasets, and compared only the matrix and Maptrix visualizations. Users answered questions about as accurately as they did in the initial study, but one of the biggest findings was that Maptrix was preferred for design while the matrix was preferred for readability.
The researchers asked for verbal feedback in both experiments, and users stated that Maptrix would be easier to use if they were able to interact with the visualization more through highlighting, filtering, and/or grouping. These comments were heard: one can do all three actions with the new version of the software. (Please play with the example found here.)
Initially, I was skeptical of the Maptrix idea, as I had only read about a static version of it in the research paper. After I interacted with the example on the authors’ website, though, I can certainly see the benefits of this visualization approach. Visualizing geographical multi-dimensional data will always be a challenge, but I believe this is a helpful step for data professionals to look beyond using flow maps only.
This article is inspired by the following paper:
- Yalong Yang, Tim Dwyer, Sarah Goodwin and Kim Marriott, “Many-to-Many Geographically-Embedded Flow Visualisation: An Evaluation.” IEEE Transactions on Visualization and Computer Graphics 23(1):411–420, 2017. DOI: 10.1109/TVCG.2016.2598885