Gestalt Grouping Principles for Scientific Figures

How to make them work for you, or at least make sure they’re not working against you

Pattern
May 6, 2022

By Mary O’Reilly

Researchers often come to us at Pattern because they feel that a figure is not successful but they’re not sure why. Others ask in a different way: “Can you make this prettier?” We try to show them that beauty is a byproduct of good design, which serves the primary goal: communication. In this article, I use a case study to demonstrate how the Gestalt Principles of Grouping can help, but also how they can hurt.

The term Gestalt refers to an organized whole being perceived as different from the sum of its parts, just as the three Pac-Man shapes in Figure 1 combine to make us perceive a triangle that isn’t there.

Figure 1. Gestalt psychology argues that the whole is different from the sum of its parts

The Gestalt principles of grouping are a branch of this theory that illuminates how human cognition leads us to perceive objects as being in groups based on the principles of proximity, similarity, continuation/closure, connection, and enclosure, among others. You certainly don’t have to use these principles for your scientific figures, but if you’re not aware of them, you may be unwittingly using them against yourself.

The Problem

The Khera Lab at the Broad came to us because they’d created a figure for their paper¹ on using DNA markers of ancestry to improve the accuracy of DNA-based prediction of coronary artery disease (CAD) risk (Figure 2). The figure said everything they wanted to say, but they still weren’t happy with it. The first question we like to ask is: what is the goal of this figure?

Figure 2. Original figure depicting analysis of ancestry-adjusted CAD risk

Their goal was to illustrate that there is a population-level analysis (top two plots) that is separate from an individual-level analysis (bottom two plots). In other words, they wanted a group that included the top two plots and a group that included the bottom two plots. What they didn’t realize was that our brains were being told something different.

Gestalt Principles at Work— Proximity, Connection, and Continuation

First of all, by the principle of proximity, objects located closer together in space will be perceived as being in a group (Figure 3A-B). The plots in the original figure appear to be grouped in vertical pairs, rather than the desired horizontal pairs, causing us to see a left-hand group and a right-hand group instead of a top group and a bottom group. Secondly, Figure 3C illustrates how connecting objects with lines also creates the illusion of groups through the principle of connection.

The original schematic figure has all of the plots connected by arrows, effectively creating one big group. Finally, due to the positions of those arrows and the principle of continuation or closure (Figure 3D), our brains fill in the gaps to perceive a rectangle behind the plots. This unintended rectangle anchors the four plots and binds them, again, into one big group. The good news is that all three of these issues were very easy to fix once we recognized them.

Figure 3. Grouping by proximity, connection, and continuation/closure

Figure 4 shows how simple remodeling of the original figure can break the unwanted groupings and favor the desired one by 1) flipping the proximity, so that the plots within each row sit closer together than the rows do, and 2) detaching the top plots from the bottom ones, which both breaks the connection and removes the sneaky rectangle.

Figure 4. Reversing proximity and breaking unwanted connections promotes desired grouping
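
The effect of flipping proximity can also be checked with a little arithmetic. The sketch below is a hypothetical illustration rather than a measurement of the actual figure: the panel names, the gap values, and the `nearest_neighbor` helper are all assumptions made for this example. It places four panel centers on a 2×2 grid and asks which panel each one sits closest to.

```python
import math

def grid_centers(col_gap, row_gap):
    """Centers of four panels on a 2x2 grid (hypothetical layout units)."""
    return {
        "population_pca": (0.0, row_gap),
        "population_hist": (col_gap, row_gap),
        "individual_pca": (0.0, 0.0),
        "individual_hist": (col_gap, 0.0),
    }

def nearest_neighbor(centers):
    """Map each panel to the panel whose center is closest to it."""
    result = {}
    for name, pos in centers.items():
        others = [
            (math.dist(pos, other_pos), other)
            for other, other_pos in centers.items()
            if other != name
        ]
        result[name] = min(others)[1]
    return result

# Assumed "original" layout: columns far apart, rows close together.
original = nearest_neighbor(grid_centers(col_gap=3.0, row_gap=1.0))
# Assumed "revised" layout: the two gaps are swapped.
revised = nearest_neighbor(grid_centers(col_gap=1.0, row_gap=3.0))

print("original:", original["population_pca"])
print("revised: ", revised["population_pca"])
```

With the wide column gap, each panel’s nearest neighbor is the one above or below it, which is the vertical pairing we wanted to avoid. Swapping the gaps makes the nearest neighbor the panel beside it, favoring the horizontal pairs.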

Gestalt Principles at Work — Similarity

We’re not done yet though. As you can see in Figure 5, another Gestalt principle of grouping is the principle of similarity. Color and shape are just two examples shown here. Similar colors will be grouped together perceptually, just as similar shapes will be.

Figure 5. The principle of similarity groups objects similar in shape and color

If you look at the way the authors used color in the title headers of the plots, these nicely follow the principle of similarity and group the plots appropriately.

However, both shape and color similarity might force our brains to group together the PCA data plots into one group (green) and the histograms into another group (orange). We can’t do anything about the shape of the data, but the colors don’t serve any purpose here. What if we just took them away (Figure 6)?

Figure 6. Removing the color that created undesired groupings

You might be wondering why we didn’t just flip the colors the way we flipped proximity, and that is a great question. We could have made the data in the top group one color and the data in the bottom group a different color to reinforce the horizontal grouping. But once we had arrived at this nearly colorless figure, we were free to use color in a completely different and more meaningful way.

With this in mind, we considered other ways of using color as an encoding system instead of for grouping. As illustrated in the sketch in Figure 7, we decided to use color to encode the spectrum of different ancestry markers (rainbow color scale) and different CAD risks (greyscale).

Figure 7. Using color to put data into context rather than for similarity

Looking at the right-hand side of the sketch first, every participant has their own position on the histogram’s x-axis based on their DNA markers of CAD risk — and thus every participant is associated with a particular shade of grey. On the low end of the spectrum, participants’ DNA markers are associated with a relatively lower risk of CAD, while the high end is populated by participants whose DNA markers are associated with a higher risk of CAD. Most people, as the histogram indicates, have an average risk. With this reference histogram, a new participant could have their DNA markers analyzed and be assigned a polygenic risk score.
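
One way to picture this encoding is as a function from a score to a shade. The sketch below is only an illustration, not the method used to draw the published figure: the `score_to_grey` helper, the score range, and the choice that darker means higher risk are all assumptions.

```python
def score_to_grey(score, low, high):
    """Map a polygenic score to a hex grey; darker means a higher score (assumed)."""
    if high <= low:
        raise ValueError("high must be greater than low")
    # Clamp to the reference range, then rescale to 0..1.
    t = (min(max(score, low), high) - low) / (high - low)
    # 255 is white at the low end of the range, 0 is black at the high end.
    level = round(255 * (1.0 - t))
    return "#{0:02x}{0:02x}{0:02x}".format(level)

# Hypothetical scores spanning the assumed reference range.
for s in (-3.0, 0.0, 3.0):
    print(s, score_to_grey(s, low=-3.0, high=3.0))
```

Because the shade depends only on the x-axis position, a new participant’s score can be looked up on the same scale as the reference histogram.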

Every person in the analysis also has their own x/y coordinate on the rainbow-colored principal component analysis (PCA) plot — and thus a unique color. These plots represent the participants’ ancestry as determined from a separate set of DNA variants — similar to the sites of DNA variation that commercial genealogy companies use to tell you that you are 5% Iberian or that you have Viking ancestors, for example. Each point on the plot is a participant, and the closer together the points are on the plot, the more similar their genetic markers of ancestry. This measure of ancestry was added to the study to improve the accuracy of risk prediction. If you want an accurate cooking time for boiling macaroni, you should take into account your altitude because the boiling point of water is different at different altitudes. Similarly, subtle differences in ancestry can affect one’s risk of developing CAD.
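
The idea that a position on the plot implies a color can be sketched in a few lines. The actual color scale of the redesigned figure is not specified here, so the mapping below is an assumption: hue follows one principal component and brightness the other. The `pca_to_rgb` name and the coordinate ranges are hypothetical.

```python
import colorsys

def pca_to_rgb(pc1, pc2, pc1_range=(-1.0, 1.0), pc2_range=(-1.0, 1.0)):
    """Map a PCA coordinate to an RGB triple with channels in 0..255."""
    def unit(value, bounds):
        # Clamp to the given range and rescale to 0..1.
        lo, hi = bounds
        return (min(max(value, lo), hi) - lo) / (hi - lo)

    # Stop the hue at 0.8 so the scale does not wrap back around to red.
    hue = 0.8 * unit(pc1, pc1_range)
    # Keep brightness in 0.5..1.0 so no point fades to black.
    brightness = 0.5 + 0.5 * unit(pc2, pc2_range)
    r, g, b = colorsys.hsv_to_rgb(hue, 1.0, brightness)
    return tuple(round(255 * c) for c in (r, g, b))

# Two hypothetical participants at different positions on the plot.
print(pca_to_rgb(-1.0, 1.0))
print(pca_to_rgb(0.2, -0.4))
```

Under this mapping, nearby points get similar colors and distant points get clearly different ones, although points that are extremely close together can round to the same 8-bit color.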

These spectra of color or greyscale help to give the data context. They show how each individual contributes to the population-level analysis, and also how one could then use these reference plots to assign an ancestry-adjusted score to a new participant in the individual-level analysis. However, through the principle of color similarity, we’ve now edged back towards a left-hand group and a right-hand group. We needed to use a stronger principle to group the population and individual-level analyses.

Gestalt Principles at Work — Enclosure

Of all of the Gestalt Principles, the strongest is enclosure (Figure 8A). If you draw a boundary around a group of objects, you almost can’t help but perceive it as a group, even if there is a mix of different colors or shapes. We could have drawn a box around the top or bottom plots, but we began to wonder whether we even needed to have two sets of plots. Maybe we could use the people to distinguish individuals from population, which might be more intuitive. To do this, we drew a circle around the individual, leaving the population outside of the circle, using the principle of enclosure (Figure 8B).

Figure 8. The principle of enclosure is the strongest

This method of nested enclosure, with the population on the outside and the individual on the inside, also enabled the two groups to interact with one another in space, illustrating how the individual-level analysis draws on the population analysis (Figure 9). The individual’s coordinates are found on the population genetic ancestry reference (PCA plot), and that information is applied to their polygenic score from the histogram to calculate an ancestry-adjusted polygenic score. Now there was no need to duplicate the reference plots. For added clarity, we used strong typography to label population and individual, reinforcing the grouping by enclosure.

Figure 9. The original (A) and final redesigned figure (B) grouped by enclosure

Finally, the map was added at the Khera group’s request because this study was restricted to South Asian participants. While this type of risk prediction works well in European-derived populations, its predictive power is weaker in populations whose genomes are underrepresented in genetic databases. They wanted to address this inequity. South Asians suffer disproportionately from CAD and, even though they account for 23% of the global population, were represented in only 1.8% of genetic association studies at the time of this study’s publication. This Eurocentric bias makes it much more difficult to predict disease in the South Asian population, so the Khera group confined their analysis to nearly eight thousand South Asian participants. The impact of these results is concrete: South Asians in the top 5% of polygenic scores were identified as having a 3-fold greater risk of CAD.

Conclusion

To truly have an impact, though, results like these need to be communicated. The Gestalt Principles of Grouping are a powerful tool in a designer’s toolbox, and anyone can use them. They can reduce the viewer’s cognitive load by doing some of the work of perception behind the scenes. They may not be needed for every figure, but an awareness of them may save your figure from being misperceived.

  1. Minxian Wang and Amit V. Khera, et al. Developing Genome-wide Polygenic Risk Scores for Coronary Artery Disease in South Asians. J Am Coll Cardiol. 2020; 76(6): 703–714.
  2. For more reading on Gestalt Principles of Grouping: https://careerfoundry.com/en/blog/ui-design/what-are-gestalt-principles/

Pattern

We are a group of designers, engineers, and scientists specializing in visualization at the Broad Institute of MIT and Harvard ( @broadinstitute ).