Public health data at your fingertips

This is the second in a series of posts charting the design choices, open source tools and analytical workflows that the Trafford Data Lab is adopting.

The Trafford Data Lab supports decision-making in Trafford, a local authority in Greater Manchester, by revealing patterns in data through visualisation. It is committed to publishing open data and using open source tools to encourage a transparent and reproducible analytical workflow.


Inspired by the data visualisations in yesterday’s informative Financial Times article (O’Connor 2017) on the relationship between Blackpool’s economy and the health of its residents, I decided to visualise some of the same public health data for Trafford. Using the new fingertipsR package I’ll show you how to download some data from Public Health England’s Fingertips repository and visualise it in R with ggplot2.

Load the necessary R packages

The fingertipsR package allows users to query the API for Fingertips, a repository of public health indicators managed by Public Health England.

library(fingertipsR)
library(tidyverse)
library(stringr)
library(ggplot2)

The key function of the fingertipsR package is fingertips_data() which requires an indicator and an area type to be supplied. Before we can provide these values we need to know which indicators and area types are available on Fingertips.

Return a list of indicators from Fingertips

Use the indicators() function to retrieve all of the indicators from Fingertips.

indicators <- indicators()

The resulting data frame contains over 4,500 public health indicators, categorised into over 200 domains (including ‘Child obesity’, ‘Hospital admissions’, and ‘Suicide data’) and 66 higher-level profiles.

A sample of indicators on Fingertips

Filter the indicator name field for “antidepressant”

We are interested in visualising antidepressant prescription rates, so we need to filter the IndicatorName field.

indicators %>%
  filter(str_detect(IndicatorName, regex("antidepressant", ignore_case = TRUE))) %>%
  View()

Note that we use regex and the str_detect function from the stringr package to search for the “antidepressant” string (ignoring case).
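To see what the case-insensitive match does in isolation, here is a minimal sketch on a small vector of indicator names. The names are invented for illustration and are not taken from Fingertips.

```r
library(stringr)

# Hypothetical indicator names, invented for illustration only
indicator_names <- c(
  "Antidepressant prescribing",
  "Excess weight in adults",
  "Prescribing of ANTIDEPRESSANT items",
  "Smoking prevalence"
)

# regex() with ignore_case = TRUE matches regardless of capitalisation
is_match <- str_detect(indicator_names, regex("antidepressant", ignore_case = TRUE))

# Keep only the names that contain the search string
indicator_names[is_match]
```

Without the regex() wrapper, str_detect() would be case sensitive and would miss names that capitalise the word differently.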

The resulting data frame contains 6 indicators with “antidepressant” in the IndicatorName field. Let’s choose the indicator with the IndicatorID of 90527.

Indicators with “antidepressant” string

Identify the area type of interest

The public health data on Fingertips are aggregated at different statistical and administrative geographies. To return a list of all the available geographies you need to run the area_types() function.

area_types() %>%
  select(AreaTypeID, AreaTypeName) %>%
  distinct() %>%
  View()

Some of the area types available on Fingertips

The function returns 23 different geographies, including GPs, Laboratories, Local Authority districts, and Government Office Regions.

A quick check of the Fingertips website indicates that antidepressant prescription rates are available at the level of Clinical Commissioning Groups (CCGs). The table provides a corresponding AreaTypeID of 153.
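Rather than reading the ID off the table by eye, it could also be pulled out programmatically. The sketch below uses a small stand-in table: only the CCG row reflects values quoted in this post, and the other rows, along with the names toy_area_types and ccg_id, are invented for illustration.

```r
library(dplyr)
library(stringr)

# Stand-in for the table returned by area_types(). Only the CCG row reflects
# values quoted in this post; the other rows are invented for illustration.
toy_area_types <- tibble(
  AreaTypeID   = c(901, 902, 153),
  AreaTypeName = c("Example area type A", "Example area type B", "CCGs (pre 4/2017)")
)

# Keep rows whose name contains the literal string "CCG", then extract the ID
ccg_id <- toy_area_types %>%
  filter(str_detect(AreaTypeName, fixed("CCG"))) %>%
  pull(AreaTypeID)

ccg_id
```

The same pipeline should work on the real output of area_types(), assuming it has the AreaTypeID and AreaTypeName columns selected above.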

Extract the data

Armed with the IndicatorID and the AreaTypeID we can now query the Fingertips API using the fingertips_data() function.

query <- fingertips_data(IndicatorID = 90527, AreaTypeID = 153)

Data at CCG level geography for the ‘Antidepressant prescribing: average daily quantities (ADQs) per STAR-PU’ indicator are returned.
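Before filtering, it can help to check which area types and time periods a query has returned. A minimal sketch on a stand-in data frame follows. All values are invented, and only the column names follow those used in this post.

```r
library(dplyr)

# Stand-in for the result of fingertips_data(); all values are invented
toy_query <- tibble(
  AreaName   = c("England", "NHS Trafford CCG", "NHS Trafford CCG", "NHS Bury CCG"),
  AreaType   = c("England", "CCGs (pre 4/2017)", "CCGs (pre 4/2017)", "CCGs (pre 4/2017)"),
  Timeperiod = c("2016/17", "2015/16", "2016/17", "2016/17"),
  Value      = c(1.4, 1.3, 1.4, 1.6)
)

# Count the rows for each combination of area type and time period
period_counts <- count(toy_query, AreaType, Timeperiod)

period_counts
```

A summary like this shows the exact strings to use when filtering on AreaType and Timeperiod.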

Filter the data

The data require a little cleaning before they can be visualised. Only rates for CCGs recorded during 2016/17 are retained.

df <- query %>%
  select(IndicatorID, AreaCode, AreaName, AreaType, Timeperiod, Value) %>%
  filter(AreaType == "CCGs (pre 4/2017)" & Timeperiod == "2016/17")

First few rows of filtered data frame returned from query

Visualise the data

The antidepressant prescription data can now be visualised as a simple histogram using ggplot2.

ggplot(df, aes(x = Value)) +
  geom_histogram(bins = 30) +
  labs(x = NULL, y = "Number of CCGs",
       title = "Units of antidepressant prescribed per person",
       subtitle = "Public Health England, 2016-17",
       caption = "Source: Fingertips")

Or using our own ggplot theme:

source("https://trafforddatalab.github.io/assets/theme/ggplot2/theme_lab.R")

# Trafford is highlighted against all CCGs, which are drawn in the background
df_unit <- filter(df, AreaName == "NHS Trafford CCG")
df_background <- df

ggplot(df_unit, aes(x = Value)) +
  # Background layer: distribution across all CCGs
  geom_histogram(data = df_background,
                 bins = 30,
                 fill = "#757575",
                 alpha = 0.6) +
  # Dashed line at the median rate across all CCGs
  geom_vline(xintercept = median(df_background$Value, na.rm = TRUE),
             colour = "#757575", linetype = "dashed") +
  # Foreground layer: Trafford only, with the same number of bins
  geom_histogram(bins = 30, fill = "#fc6721", colour = "white") +
  scale_x_continuous(limits = c(0, 2.3)) +
  scale_y_continuous(expand = c(0, 0)) +
  labs(x = "Units of antidepressant prescribed per person", y = "Number of CCGs",
       title = "Trafford has an average rate of antidepressant prescribing",
       subtitle = "Public Health England, 2016-17",
       caption = "Source: Fingertips | @traffordDataLab") +
  theme_lab() +
  theme(panel.grid.major.x = element_blank())

Kudos to the developers of the fingertipsR package, and to Simon Jackson for the handy trick of plotting background data in ggplot2.
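The background data trick can be sketched in isolation on made-up values: one layer draws every area in grey, and a second layer redraws only the highlighted area on top. The data frame toy and its columns are hypothetical, and a fixed binwidth and boundary are assumed here so that both layers share the same bins.

```r
library(ggplot2)

# Made-up values for illustration; "B" plays the role of the highlighted area
toy <- data.frame(
  area  = c("A", "B", "C", "D", "E", "F"),
  value = c(0.9, 1.4, 1.2, 1.6, 1.5, 2.0)
)
highlight <- toy[toy$area == "B", ]

# The first layer overrides the plot data to show every area in grey; the
# second layer inherits the highlighted subset and is drawn on top of it
p <- ggplot(highlight, aes(x = value)) +
  geom_histogram(data = toy, binwidth = 0.25, boundary = 0, fill = "grey70") +
  geom_histogram(binwidth = 0.25, boundary = 0, fill = "#fc6721")

p
```

Passing the subset to ggplot() and the full data to the first layer means any later layer highlights the subset by default.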


Written by Henry Partridge, Manager of the Trafford Data Lab


References

Fox, Sebastian, & Flowers, Julian (2017). fingertipsR: Fingertips Data for Public Health. R package version 0.1.3. https://CRAN.R-project.org/package=fingertipsR

O’Connor, Sarah (2017). “Left behind: can anyone save the towns the economy forgot?”, Financial Times, 16 November 2017. Available via: https://www.ft.com/content/b6dbf34e-c987-11e7-aa33-c63fdc9b8c6c

Wickham, Hadley (2017). tidyverse: Easily Install and Load ‘Tidyverse’ Packages. R package version 1.1.1. https://CRAN.R-project.org/package=tidyverse

Wickham, Hadley (2017). stringr: Simple, Consistent Wrappers for Common String Operations. R package version 1.2.0. https://CRAN.R-project.org/package=stringr

Wickham, Hadley (2016). ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York.