Top 10 data visualisations for schools

Rich Davies
10 min read · Nov 19, 2019

During my six years at Ark Schools, I’ve been tasked with analysing and sharing a lot of data. My audiences have typically been network and school leaders, as well as middle leaders and teachers — i.e. busy people facing multiple demands on their time. My objective has therefore been to get them past the data and to the insight as quickly as possible.

Schools and networks up and down the country are awash with data, but it is usually contained within various tables that cannot be interpreted quickly. To be fair, tables serve an important purpose in any organisation’s data infrastructure, especially when we need to:

  • be precise
  • look up individual values
  • show both detailed and summary values
  • include multiple different units of measure

However, these conditions often don’t apply when schools are trying to make data-informed decisions. As such, we can usually derive insights more quickly and clearly from data visualisations than from tables.

But not all data visualisations work well in all contexts. If we choose an inappropriate visualisation for a given analysis, it could easily prove less useful than a simple table.

We therefore need to choose the most appropriate visualisation depending on the analysis required. While there are myriad analyses performed within the schools sector, the following ten types represent the vast majority of what I have worked on during my time at Ark.

1. Comparing values between groups

e.g. Which schools have the highest/lowest pass-rates?

This is probably the most common analysis I perform, be it comparing networks, schools, subjects, year groups or any other common differentiator. The Horizontal Bar Chart is therefore my go-to visualisation, almost always sorted from largest to smallest (unless there is a meaningful alternative convention). Labels can be comfortably scanned as a list and bar lengths can be easily compared by eye.

Sometimes I’ve needed to add a comparison to some secondary data — e.g. targets or predictions. Reference Lines make these secondary comparisons visible without losing the primary comparison shown via the bars.

One critical rule when using bar charts is to never break the axis above zero, since doing so would make the relative length of each bar somewhat arbitrary. However, since it is sometimes necessary to compare small differences between large values (e.g. for attendance rates), I would recommend switching to a Dot Chart. Since these charts involve comparing positions rather than lengths, it is a lot less problematic to cut their axes above zero.
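
To make this concrete, here is a minimal sketch of a sorted Horizontal Bar Chart with target reference lines. It uses Python and matplotlib purely for illustration (any charting tool would do), and every school name, pass-rate and target below is a made-up dummy value.

```python
import matplotlib.pyplot as plt

# Dummy data: pass-rates and targets for five hypothetical schools,
# pre-sorted so the largest value ends up at the top of the chart
schools = ["School E", "School D", "School C", "School B", "School A"]
pass_rates = [55, 62, 71, 78, 83]
targets = [65, 70, 72, 75, 80]

fig, ax = plt.subplots()
y_pos = range(len(schools))
ax.barh(y_pos, pass_rates, color="steelblue")

# Reference lines: mark each school's target without hiding the bars
for y, t in zip(y_pos, targets):
    ax.plot([t, t], [y - 0.4, y + 0.4], color="black", linewidth=2)

ax.set_yticks(list(y_pos))
ax.set_yticklabels(schools)
ax.set_xlim(0, 100)  # keep the axis anchored at zero
ax.set_xlabel("Pass-rate (%)")
plt.tight_layout()
plt.show()
```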

2. Observing trends over time

e.g. Which schools’ pass-rates have been increasing/declining each year?

Apparent school performance can be volatile, so trend analyses help us distinguish short-term glitches from longer-term patterns. But multiple time periods mean more data-points, making things harder to mentally process. A Line Chart helps avoid mental overload by consolidating these multiple points into a single pattern. We can even distinguish between a few (but not too many) different patterns by using colour to differentiate between multiple lines.

If we want to show that a second characteristic (with relatively few possible values) is also changing over time, we could opt for a Vertical Bar Chart and use colour to represent the second variable. Either way, the key rule when visualising trends is that time should flow from left to right, not vertically.
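
Here is a similarly hedged Line Chart sketch in matplotlib, again with dummy values: one line and one colour per hypothetical school, with time flowing left to right.

```python
import matplotlib.pyplot as plt

# Dummy data: yearly pass-rates for three hypothetical schools
years = [2015, 2016, 2017, 2018, 2019]
results = {
    "School A": [61, 64, 66, 70, 73],
    "School B": [72, 71, 69, 70, 68],
    "School C": [55, 58, 63, 62, 67],
}

fig, ax = plt.subplots()
for school, rates in results.items():
    # one line (and one colour) per school, with time flowing left to right
    ax.plot(years, rates, marker="o", label=school)

ax.set_xticks(years)
ax.set_ylabel("Pass-rate (%)")
ax.legend()
plt.tight_layout()
plt.show()
```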

3. Comparing shares of a total

e.g. What proportion of my school’s students belong to each ethnic group?

I have sometimes been asked to analyse the composition of a given population — i.e. breaking down the students within a given school along some characteristic or other. Whenever I’ve asked why, it’s usually been in order to identify the relative size of different groups of students. As such, this analysis is not actually that different from the simple comparison of groups already discussed above. The only distinction is that the values being compared add up to 100%. One way of subtly indicating this distinction is to use a Gapless Horizontal Bar Chart. While not strictly necessary, removing the gaps between bars helps signal that this is not just any old bar chart.

I often see pie charts used for this type of analysis, but since people are much better at comparing lengths than angles, pie charts are almost never the best option. This is doubly true for 3D pie charts, which actually misrepresent the relative size of each item. As a rule of thumb, going 3D is usually a bad idea.
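
As an illustrative sketch, a Gapless Horizontal Bar Chart can be produced in matplotlib simply by setting the bar height to 1.0; the group names and shares below are dummy values that happen to sum to 100%.

```python
import matplotlib.pyplot as plt

# Dummy data: share of students in each (hypothetical) group, summing to 100%
groups = ["Group A", "Group B", "Group C", "Group D", "Group E"]
shares = [38, 27, 18, 11, 6]

fig, ax = plt.subplots()
y_pos = range(len(groups))
# height=1.0 removes the gaps between bars, hinting that the parts form a whole
ax.barh(y_pos, shares, height=1.0, edgecolor="white")
ax.set_yticks(list(y_pos))
ax.set_yticklabels(groups)
ax.invert_yaxis()  # largest share at the top
ax.set_xlabel("Share of students (%)")
plt.tight_layout()
plt.show()
```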

4. Comparing shares of totals between groups

e.g. How do grade distributions differ across different schools?

While it is sometimes interesting to analyse the composition of a single population, it is usually more insightful to compare the compositions of multiple ones. We do this a lot within Ark, which is why so many of our dashboards include a Horizontal Stacked Bar Chart.

Often, the way we categorise our data will have some kind of in-built hierarchy (e.g. ‘Highest performing to lowest’ or ‘Most advantaged to least’). When this is the case, it is helpful to sort these categories from left to right. This type of sorting is particularly useful if we ever need to quickly compare against various thresholds (e.g. ‘Students with an A or above’, ‘Students with a B or above’, ‘Students with a C or above’ etc).
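
A possible matplotlib sketch of a Horizontal Stacked Bar Chart, with invented grade distributions; each grade band is stacked to the right of the previous one, highest grade first.

```python
import matplotlib.pyplot as plt
import numpy as np

# Dummy data: grade distribution (%) per school, grades ordered highest to lowest
schools = ["School A", "School B", "School C"]
grades = ["A", "B", "C", "D"]
dist = np.array([
    [25, 40, 25, 10],   # School A
    [15, 35, 35, 15],   # School B
    [30, 30, 30, 10],   # School C
])

fig, ax = plt.subplots()
left = np.zeros(len(schools))
for j, grade in enumerate(grades):
    # stack each grade band to the right of the previous one
    ax.barh(schools, dist[:, j], left=left, label=grade)
    left += dist[:, j]

ax.set_xlabel("Share of students (%)")
ax.legend(title="Grade")
plt.tight_layout()
plt.show()
```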

5. Observing a trend in shares of totals over time

e.g. How has my school’s English/Maths cross-over changed each year?

Given the rule of thumb discussed above around time always flowing from left to right, the one use-case where I would recommend a Vertical Stacked Bar Chart is when we need to analyse how a population’s composition has changed over time.

As with its horizontal counterpart, this type of visualisation also benefits from sorting categories in line with any in-built hierarchies. However, in this case, the most useful sort-order is from bottom to top (not top to bottom), since this again lends itself to easier comparison against various thresholds.
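
And a sketch of the vertical, over-time counterpart, again with dummy values; the bands stack upwards from the best grade so the various thresholds can be read straight off the chart.

```python
import matplotlib.pyplot as plt
import numpy as np

# Dummy data: grade distribution (%) per year, best grade listed first
years = ["2017", "2018", "2019"]
grades = ["A", "B", "C", "D"]
dist = np.array([
    [20, 35, 30, 15],   # 2017
    [24, 36, 28, 12],   # 2018
    [28, 37, 25, 10],   # 2019
])

fig, ax = plt.subplots()
bottom = np.zeros(len(years))
for j, grade in enumerate(grades):
    # the best grade sits at the bottom, so "A or above", "B or above" etc.
    # can be read straight up from the axis
    ax.bar(years, dist[:, j], bottom=bottom, label=grade)
    bottom += dist[:, j]

ax.set_ylabel("Share of students (%)")
ax.legend(title="Grade")
plt.tight_layout()
plt.show()
```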

6. Comparing full distributions between groups

e.g. How do raw mark distributions differ across different classes?

While it is usually possible to group educational data into various buckets or categories, it is sometimes useful to keep the original level of granularity (e.g. raw test scores) when analysing our data. This is when I would use a Horizontal Strip Plot, which shows a separate dot for each individual value. If multiple data-points have the same value, the dot simply gets bigger.

The purpose of this plot is to get a sense of how distributions vary between groups, but sometimes we need a bit of help to outline the shape of each distribution. This is when I would use a Box & Whisker Plot, which effectively overlays a regular strip plot with a few lines to show the minimum, median (i.e. half-way) and maximum values — as well as the 1st and 3rd quartile values (i.e. the points below which the bottom 25% of values sit and above which the top 25% sit). Of course, while this helps provide a sense of how the distributions compare, it is only a simplified outline. But sometimes that’s all you need.
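
Here is one possible matplotlib sketch of a Horizontal Strip Plot with a Box & Whisker overlay; the classes and marks are invented, and the sizing rule of 40 units of marker area per student is an arbitrary choice for illustration.

```python
import matplotlib.pyplot as plt
from collections import Counter

# Dummy data: raw test marks for two hypothetical classes
marks = {
    "Class 1": [34, 41, 41, 47, 52, 52, 52, 58, 63, 70],
    "Class 2": [28, 36, 44, 44, 49, 55, 55, 61, 66, 74],
}

fig, ax = plt.subplots()
for y, (label, values) in enumerate(marks.items(), start=1):
    counts = Counter(values)
    xs = list(counts)
    # one dot per distinct mark, sized by how many students achieved it
    ax.scatter(xs, [y] * len(xs), s=[40 * counts[x] for x in xs], alpha=0.6)

# overlay boxes and whiskers to outline each distribution
ax.boxplot(list(marks.values()), vert=False,
           positions=range(1, len(marks) + 1), widths=0.5, showfliers=False)
ax.set_yticks(range(1, len(marks) + 1))
ax.set_yticklabels(list(marks))
ax.set_xlabel("Raw mark")
plt.tight_layout()
plt.show()
```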

7. Observing a trend in full distributions over time

e.g. How have raw mark distributions changed each year?

If you’ve read this far, you can probably guess what my advice will be for this type of analysis. It’s a full distribution, but we’re looking at it over time rather than comparing groups at the same time. As such, I would recommend a Vertical Strip Plot.

Again, overlaying boxes and whiskers can help track how the shape of a distribution evolves over time, but will not reflect its every nuance.
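
A brief sketch of the vertical variant, with dummy marks for three hypothetical years:

```python
import matplotlib.pyplot as plt
from collections import Counter

# Dummy data: raw marks for the same assessment across three years
marks_by_year = {
    "2017": [31, 38, 38, 45, 51, 51, 57, 64],
    "2018": [35, 41, 47, 47, 52, 58, 58, 66],
    "2019": [39, 44, 50, 50, 55, 55, 61, 70],
}

fig, ax = plt.subplots()
for x, (year, values) in enumerate(marks_by_year.items(), start=1):
    counts = Counter(values)
    ys = list(counts)
    # vertical strips: one dot per distinct mark, sized by frequency
    ax.scatter([x] * len(ys), ys, s=[40 * counts[y] for y in ys], alpha=0.6)

# box & whisker overlay to outline how the distribution shifts year on year
ax.boxplot(list(marks_by_year.values()),
           positions=range(1, len(marks_by_year) + 1), widths=0.5, showfliers=False)
ax.set_xticks(range(1, len(marks_by_year) + 1))
ax.set_xticklabels(list(marks_by_year))
ax.set_ylabel("Raw mark")
plt.tight_layout()
plt.show()
```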

8. Comparing values between groups along two or three measures

e.g. How do schools’ pass-rates compare with their attendance rates?

The above analyses have typically allowed us to look at one measure at a time, but we often want to look at combinations of measures (e.g. English results & Maths results) or even identify potential relationships between them. When we are primarily interested in two measures, I would use a Scatter Chart.

Packages like Excel will happily draw a line of best fit through any plot, but I would suggest leaving these lines out unless they serve a purpose (i.e. if there is a decent correlation). And, needless to say, correlation cannot prove causation, so beware of misinterpreting the implications of any visible relationships.

It is also possible to represent a third measure by ‘upgrading’ to a Bubble Chart, which sizes each dot based on this third measure. I typically find that this works best when the third measure reflects some kind of size (e.g. number of students) and only needs to be known to a low level of precision (i.e. ‘Lots of students’ vs ‘Not many students’), since people are surprisingly bad at accurately comparing 2D areas.
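
A minimal Scatter/Bubble Chart sketch in matplotlib with invented school-level values; the bubble sizing is a rough scaling rather than anything precise, in keeping with the point above.

```python
import matplotlib.pyplot as plt

# Dummy data: attendance vs pass-rate per school, bubbles sized by student numbers
attendance = [93.5, 95.1, 94.2, 96.0, 92.8]
pass_rate = [58, 71, 66, 79, 54]
students = [420, 780, 610, 950, 350]

fig, ax = plt.subplots()
# the division by 3 is just an arbitrary scaling to keep marker areas readable
sizes = [n / 3 for n in students]
ax.scatter(attendance, pass_rate, s=sizes, alpha=0.6)

ax.set_xlabel("Attendance rate (%)")
ax.set_ylabel("Pass-rate (%)")
plt.tight_layout()
plt.show()
```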

If we really need to represent a fourth variable, we could use colour, but this risks overwhelming the audience — so use it only with caution.

9. Comparing values between groups across multiple measures

e.g. How have schools performed along multiple different measures?

I have previously argued that educational performance cannot be captured by any single measure, so we should use baskets of measures to broaden our understanding. But how best to visualise multiple measures at the same time? I would advocate trying a Heat Map in the first instance. These look a lot like tables, but have the advantage of using colour to help us quickly identify higher/lower values within each measure.

I’ve sometimes seen heat maps with bold red/amber/green colour schemes, but I have invariably found myself looking at all of these colours at the same time, which was probably not the author’s intention. One way to avoid this is to use a single hue (e.g. blue, orange, grey etc) but increase the intensity (i.e. darkness) of that colour in line with the measure value. Our eyes are then inevitably drawn towards the darkest values, which are usually the most important ones to be scrutinised. This approach is also more printer-friendly.
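
One way to sketch a single-hue Heat Map in matplotlib is with imshow and the 'Blues' colormap. The schools, measures and values below are dummy data, all conveniently on a 0-100 scale; in practice each measure might need its own normalisation if the scales differ.

```python
import matplotlib.pyplot as plt
import numpy as np

# Dummy data: several measures per school, all on a 0-100 scale for simplicity
schools = ["School A", "School B", "School C", "School D"]
measures = ["Pass-rate", "Attendance", "Progress", "Enrolment"]
values = np.array([
    [62, 94, 55, 88],
    [78, 96, 71, 93],
    [71, 93, 64, 97],
    [55, 91, 49, 84],
])

fig, ax = plt.subplots()
# a single-hue colormap ("Blues"), so darker simply means higher
im = ax.imshow(values, cmap="Blues", vmin=0, vmax=100)

ax.set_xticks(range(len(measures)))
ax.set_xticklabels(measures, rotation=30, ha="right")
ax.set_yticks(range(len(schools)))
ax.set_yticklabels(schools)

# print the value in each cell so the heat map still works as a look-up table
for i in range(len(schools)):
    for j in range(len(measures)):
        ax.text(j, i, values[i, j], ha="center", va="center", fontsize=8)

fig.colorbar(im, ax=ax)
plt.tight_layout()
plt.show()
```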

Alternatively, we could align several bar charts — one for each measure. We could sort all bars based on a single primary measure, thus identifying whether any other measures show a similar sort pattern (which would suggest they are correlated). Or we could sort each measure individually, and then use a common colour scheme to quickly ‘trace’ how a given group’s ranking varies across each measure. Which of these is most appropriate really depends on the specific objectives of the analysis in question.

It is also possible to perform a multivariate regression analysis in order to better understand the relationships between multiple measures. However, since this is not a form of data visualisation, I won’t be discussing it here.

10. Observing variation across geography

e.g. Where are higher/lower attending students more/less likely to live?

When I started at Ark, I thought I was going to do a lot of mapping, but in reality I have only done so a few times over the years. Nevertheless, we may sometimes need to check how a measure relates to geographic locations (e.g. particular neighbourhoods or regions), which is when a Geospatial Map can illuminate better than any bar chart or scatter plot.

Unlike most of the analyses listed above, geospatial visualisations require more specialist software than Excel. However, mapping packages usually only require tables of postcodes as inputs, along with any other measures to be visualised (e.g. attendance or attainment). Colour is probably the easiest way to represent these other measures, though bubble size is also an option — especially if the measure reflects size (e.g. number of students).
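
Purely to illustrate the idea (a real map would need a dedicated mapping package and a postcode-to-coordinates look-up), here is a bare-bones matplotlib sketch that colours a handful of invented coordinates by an attendance measure.

```python
import matplotlib.pyplot as plt

# Dummy data: made-up coordinates standing in for postcode centroids,
# with an attendance measure for each location
longitude = [-0.12, -0.10, -0.15, -0.08, -0.11, -0.14]
latitude = [51.50, 51.52, 51.49, 51.51, 51.53, 51.48]
attendance = [97, 92, 88, 95, 90, 85]

fig, ax = plt.subplots()
# colour each point by the measure; a proper basemap would need a mapping package
sc = ax.scatter(longitude, latitude, c=attendance, cmap="Blues", s=80, edgecolor="grey")
fig.colorbar(sc, ax=ax, label="Attendance (%)")

ax.set_xlabel("Longitude")
ax.set_ylabel("Latitude")
plt.tight_layout()
plt.show()
```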

As a quick summary, here are the ten analysis types listed above, along with my recommended visualisation approaches:

  • Comparing values between groups: Horizontal Bar Chart (or Dot Chart)
  • Observing trends over time: Line Chart
  • Comparing shares of a total: Gapless Horizontal Bar Chart
  • Comparing shares of totals between groups: Horizontal Stacked Bar Chart
  • Observing a trend in shares of totals over time: Vertical Stacked Bar Chart
  • Comparing full distributions between groups: Horizontal Strip Plot (with Box & Whisker overlay)
  • Observing a trend in full distributions over time: Vertical Strip Plot (with Box & Whisker overlay)
  • Comparing values between groups along two or three measures: Scatter Chart (or Bubble Chart)
  • Comparing values between groups across multiple measures: Heat Map (or aligned Bar Charts)
  • Observing variation across geography: Geospatial Map

Clearly there are many more analyses and visualisations not listed here, but I think these ten reflect the vast majority of use-cases I’ve come across while working with schools.

My final recommendation would be to use data visualisation as an analysis tool — not just a presentation format. The approaches outlined above are intended to help our audiences get to insights faster, but the same applies to us as analysts. And the faster we can get to these insights, the more time we will have available to actually act on them. This is especially important in a school environment, where every additional minute is a valuable opportunity to further support our students.

N.B. All of the analyses and visualisations in this blog use dummy data.

Rich Davies

Director of Insight at Ark Schools (Views are my own)