Whose Grid Map is better? Quality Metrics for Grid Map Layouts

Grid Maps have become quite popular lately, and multiple publishers have already developed slightly different map layouts. How should one decide which layout to use? It is also not difficult to create your own map. How could you check that you are doing a good job?

Figure 1. Different US map layouts from six publishers. Visual encoding: Black border = invalid neighbors, Thick orange line = misdirection, Curved line = missing neighbors.

Quality metrics & Intuitions behind them

For these purposes, I have defined a set of metrics for evaluating the quality of a Grid Map layout. The following metrics can be used as guidelines for selecting a layout, or as optimization goals for new layouts:

  1. Lookalike (Boolean): The overall shape looks similar to the geographic map. I prefer to treat this as a yes/no question rather than trying to come up with a number, since it only helps viewers recognize at first impression that it is a map of a certain geographical region and does not affect the interpretation afterwards.
  2. Recall (%) : Regions that are neighbors in reality should appear as neighbors in the Grid Map. For example, North Dakota and South Dakota should be adjacent. A pair of regions that appear as neighbors in the grid map is considered valid if the two regions are really neighbors in a geographic map. A pair is considered invalid otherwise. Recall is computed from this formula:
    No. of valid neighbor pairs / No. of geographic neighbor pairs
  3. Inaccuracy (%) : Regions that are not neighbors should not appear as neighbors in the Grid Map. For example, California and Florida should not be next to each other. Inaccuracy is computed from this formula:
    No. of invalid neighbor pairs / No. of neighbor pairs
  4. Misdirection (%) : The relative positions between neighbor regions should be close to reality. For example, North Dakota is to the north of South Dakota, so it should be to the north of South Dakota in the Grid Map. To compute this metric, the angle between the two regions in each valid neighbor pair is computed and compared against the angle between their centroids on a geographic map. If the difference is more than 45 degrees, it is considered a misdirection. Misdirection is computed from this formula:
    No. of misdirections / No. of valid neighbor pairs
  5. Area: A good Grid Map should be compact. This is simply calculated from No. of Rows x No. of Columns
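To make the recall and inaccuracy formulas concrete, here is a minimal Python sketch. It assumes neighbor pairs are stored as sets of unordered pairs; the helper names (`pair`, `recall`, `inaccuracy`) and the toy data are my own illustration and are not taken from the study's code or from any of the six layouts.

```python
def pair(a, b):
    """Return an order-independent key for a pair of regions."""
    return frozenset((a, b))


def recall(grid_pairs, geo_pairs):
    """Share of geographic neighbor pairs that also appear as neighbors in the grid."""
    valid = grid_pairs & geo_pairs
    return len(valid) / len(geo_pairs)


def inaccuracy(grid_pairs, geo_pairs):
    """Share of grid neighbor pairs that are not neighbors geographically."""
    invalid = grid_pairs - geo_pairs
    return len(invalid) / len(grid_pairs)


# Toy data for illustration only (a hypothetical three-pair grid).
geo_pairs = {pair("ND", "SD"), pair("ND", "MN"), pair("SD", "MN"), pair("ND", "MT")}
grid_pairs = {pair("ND", "SD"), pair("ND", "MN"), pair("MT", "MN")}

print(f"recall     = {recall(grid_pairs, geo_pairs):.1%}")
print(f"inaccuracy = {inaccuracy(grid_pairs, geo_pairs):.1%}")
```

In this toy case two of the four geographic pairs are captured, and one of the three grid pairs (MT and MN) is a false neighbor.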

Case study

Six square tile grid map layouts from New York Times, NPR, Guardian, Washington Post, FiveThirtyEight and Bloomberg were selected for the study. (See Figure 1.) These were referenced by the NPR Blog. Of these six, FiveThirtyEight and Bloomberg excluded DC from their maps.

Before reading further, take another look at the image of the six layouts and choose the one you like, then see how it performs.

For the purpose of this study, two square tiles are considered primary neighbors if they share one side of the squares. They are considered secondary neighbors if their corners touch. Both primary and secondary neighbors are included for recall. However, only primary neighbors are counted for inaccuracy.
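The primary/secondary rule above can be sketched in a few lines of Python. This assumes a layout is given as a mapping from region code to a `(row, column)` cell; that representation, the function name `neighbor_pairs`, and the 2x2 example grid are hypothetical and only serve to illustrate the definition.

```python
def neighbor_pairs(layout):
    """Split tile pairs into primary (shared side) and secondary (touching corners)."""
    primary, secondary = set(), set()
    names = sorted(layout)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            d_row = abs(layout[a][0] - layout[b][0])
            d_col = abs(layout[a][1] - layout[b][1])
            if d_row + d_col == 1:
                # Cells differ by one step along exactly one axis: they share a side.
                primary.add((a, b))
            elif d_row == 1 and d_col == 1:
                # Cells are diagonal to each other: only their corners touch.
                secondary.add((a, b))
    return primary, secondary


# Hypothetical 2x2 grid, not one of the published layouts.
layout = {"WA": (0, 0), "MT": (0, 1), "OR": (1, 0), "ID": (1, 1)}
primary, secondary = neighbor_pairs(layout)
print("primary:  ", sorted(primary))
print("secondary:", sorted(secondary))
```

In a full 2x2 block like this one, the four side-sharing pairs come out as primary and the two diagonal pairs as secondary.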

Figure 2. Neighbors — primary and secondary

Results

All of them, although they look slightly different, pass the lookalike test, at least in my judgment. And as mentioned above, there is no point in trying to compare which one looks more similar to the US. Let’s look at the other metrics, which are more interesting.

Figure 3. Recall, Inaccuracy and Misdirection

In Figure 3, New York Times (nyt) receives the highest recall (89.6%), meaning that it captures 89.6% of the neighbor relationships represented by a geographic map. Its primary recall is also the highest at 67%, showing that this layout makes the best use of adjacent cells. Here are the 11 neighbor pairs that it misses:

MA,VT MA,NY MN,ND MT,SD AZ,CA AR,TX KY,OH NC,TN MD,VA MD,WV PA,WV

It also has the lowest inaccuracy (8.28%). These are the 12 false neighbor pairs that should not be next to each other:

MA,ME CT,NH MT,WA ND,WY CT,NJ IN,WV DC,OH DE,NJ DC,WV DC,NC LA,OK GA,VA

There are 5 misdirections (5.3%), which is also the lowest of the six layouts. The most notable misdirections are Virginia to the west (180°) of North Carolina instead of north (272°), and South Dakota to the east (0°) of North Dakota instead of south (89°).

NC,VA(272,180) ND,SD(89,0) MA,NH(262,180) ID,MT(332,270) NY,VT(318,270)
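The 45-degree rule can be checked against the five pairs listed above with a short Python sketch. Two things here are my reading of the numbers rather than something stated explicitly: each pair is taken to be (geographic angle, grid angle), and angles are taken to run clockwise on screen with 0° = east, 90° = south, 180° = west and 270° = north. The function names are hypothetical.

```python
import math


def grid_angle(origin, target):
    """Angle in degrees from one (row, col) cell to another.

    Rows grow downward, so 0 = east, 90 = south, 180 = west, 270 = north
    (an assumed convention, inferred from the listed values).
    """
    d_row = target[0] - origin[0]
    d_col = target[1] - origin[1]
    return math.degrees(math.atan2(d_row, d_col)) % 360


def angle_difference(a, b):
    """Smallest absolute difference between two angles, handling wrap-around at 360."""
    diff = abs(a - b) % 360
    return min(diff, 360 - diff)


def is_misdirection(geo_angle, grid_angle_deg, threshold=45):
    """True when the grid direction deviates from reality by more than the threshold."""
    return angle_difference(geo_angle, grid_angle_deg) > threshold


# The five pairs listed in the article, read as (geographic angle, grid angle).
listed = {
    "NC,VA": (272, 180),
    "ND,SD": (89, 0),
    "MA,NH": (262, 180),
    "ID,MT": (332, 270),
    "NY,VT": (318, 270),
}

for name, (geo, grid) in listed.items():
    print(name, angle_difference(geo, grid), is_misdirection(geo, grid))
```

Under these assumptions the differences are 92°, 89°, 82°, 62° and 48°, so all five exceed the 45-degree threshold. The wrap-around handling matters for pairs such as 350° versus 10°, which differ by 20° rather than 340°.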

On the other hand, NPR receives the lowest recall (82.1%). It ranks second to last for both primary and secondary recall. Guardian has the highest misdirection (12.2%), while NPR most often puts states that are not neighbors next to each other (13.7% inaccuracy).

Figure 4. Area

In terms of area, most of the layouts use an 8x12 grid, which yields an area of 96. The exceptions are Guardian and FiveThirtyEight, which use 8x13 and 8x11, respectively.
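For a layout stored as cell positions, the area can be read off the bounding box. The sketch below assumes cells are `(row, column)` tuples and that the first number in each grid size is the row count; the article does not state the order, and the product is the same either way. The function name `bounding_area` is hypothetical.

```python
def bounding_area(cells):
    """Area of the smallest rows x columns rectangle that contains every cell."""
    rows = [r for r, _ in cells]
    cols = [c for _, c in cells]
    return (max(rows) - min(rows) + 1) * (max(cols) - min(cols) + 1)


# Grid sizes reported in the article, as (rows, columns) under the assumption above.
sizes = {"most layouts": (8, 12), "Guardian": (8, 13), "FiveThirtyEight": (8, 11)}
for name, (n_rows, n_cols) in sizes.items():
    # Two opposite corners are enough to span the bounding box.
    corners = [(0, 0), (n_rows - 1, n_cols - 1)]
    print(name, bounding_area(corners))
```

This gives 96 for the 8x12 layouts, 104 for Guardian and 88 for FiveThirtyEight.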

Conclusions

Based on the results above, the New York Times’ version seems to perform best, as it captures the most neighbor relationships while also providing the least misleading visuals.

More importantly, I hope readers will find these metrics useful. I believe they should also be applicable to other types of grid map, such as hexagon layouts. It would be interesting to see how those perform compared to the square ones.

P.S. The full list of missing neighbors, false neighbors and misdirections can be found here. The data for this study and the code for computing the metrics are also available on GitHub at gridmap-layout-usa.

P.P.S. I am not affiliated with any of the publishers above and did not check if they have updated their map layouts in their more recent posts. This study focuses on evaluation method only and you are welcome to reproduce the results with another map layout.

Update (Jan 23, 2016): In the first version of this article, I had a typo in NPR’s data (two MNs instead of MT and MN) which led to the lowest scores for NPR in every metric. Thank you, Max Goldstein, for pointing that out. There was also a typo for Guardian (ZA instead of AZ). I have corrected the data and updated the analysis.

Update (Oct 20, 2016): The secondary neighbors classification missed some pairs and made the previous results for primary/secondary recalls and inaccuracy incorrect. I have corrected the data, and updated the analysis and figures.