Public health data at your fingertips
This is the second in a series of posts charting the design choices, open source tools and analytical workflows that the Trafford Data Lab are adopting.
The Trafford Data Lab supports decision-making in Trafford, a local authority in Greater Manchester, by revealing patterns in data through visualisation. It is committed to publishing open data and using open source tools to encourage a transparent and reproducible analytical workflow.
Inspired by the data visualisations in yesterday’s informative Financial Times article (O’Connor 2017) on the relationship between Blackpool’s economy and the health of its residents, I decided to visualise some of the same public health data for Trafford. Using the new fingertipsR package I’ll show you how to download some data from Public Health England’s Fingertips repository and visualise it in R with ggplot2.
Load the necessary R packages
library(fingertipsR) ; library(tidyverse) ; library(stringr) ; library(ggplot2)
The key function of the fingertipsR package is fingertips_data(), which requires an indicator and an area type to be supplied. Before we can provide these values we need to know which indicators and area types are available on Fingertips.
Return a list of indicators from Fingertips
Use the indicators() function to retrieve all of the indicators from Fingertips.
indicators <- indicators()
The resulting data frame contains over 4,500 public health indicators, grouped into more than 200 domains (including ‘Child obesity’, ‘Hospital admissions’, and ‘Suicide data’) and 66 higher-level profiles.
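One way to check counts like these is to count the distinct values in each column. The sketch below uses a small hand-made tibble as a stand-in for the result of indicators(), so it runs without a network connection. The rows are invented, and the DomainName and ProfileName column names are assumptions about the shape of the real data frame.

```r
library(dplyr)

# Hypothetical stand-in for the data frame returned by indicators();
# the rows and the DomainName / ProfileName columns are assumptions
toy_indicators <- tibble(
  IndicatorID   = c(1, 2, 3, 4),
  IndicatorName = c("Indicator A", "Indicator B", "Indicator C", "Indicator D"),
  DomainName    = c("Child obesity", "Child obesity",
                    "Hospital admissions", "Suicide data"),
  ProfileName   = c("Profile X", "Profile X", "Profile Y", "Profile Y")
)

# Count the distinct indicators, domains and profiles
counts <- toy_indicators %>%
  summarise(
    n_indicators = n_distinct(IndicatorID),
    n_domains    = n_distinct(DomainName),
    n_profiles   = n_distinct(ProfileName)
  )

counts
```

If the real data frame has the same column names, the same summarise() call should work on it unchanged.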
Filter the indicator name field for “antidepressant”
We are interested in visualising antidepressant prescription rates, so we need to filter the IndicatorName field.
indicators %>%
  filter(str_detect(IndicatorName, regex("antidepressant", ignore_case = TRUE)))
Note that we use regex() and the str_detect() function from the stringr package to search for the “antidepressant” string (ignoring case).
The resulting data frame contains 6 indicators with “antidepressant” in the IndicatorName field. Let’s choose the indicator with the IndicatorID of 90527.
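To see why the case-insensitive match matters, here is a minimal sketch on a toy data frame. The indicator names and IDs are made up for illustration and are not real Fingertips records.

```r
library(dplyr)
library(stringr)

# Hypothetical indicators, invented for this example
toy_indicators <- tibble(
  IndicatorID   = c(101, 102, 103),
  IndicatorName = c("Antidepressant prescribing (example)",
                    "Smoking prevalence (example)",
                    "Cost of antidepressants (example)")
)

# Case-insensitive search: matches "Antidepressant" and "antidepressants"
matches <- toy_indicators %>%
  filter(str_detect(IndicatorName, regex("antidepressant", ignore_case = TRUE)))

# Case-sensitive search: misses the capitalised "Antidepressant"
strict <- toy_indicators %>%
  filter(str_detect(IndicatorName, "antidepressant"))

matches
strict
```

Because indicator names often start with a capital letter, the plain pattern would silently drop some of the rows we want.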
Identify the area type of interest
The public health data on Fingertips are aggregated at different statistical and administrative geographies. To return a list of all the available geographies you need to run the area_types() function.
area_types() %>%
  select(AreaTypeID, AreaTypeName) %>%
  distinct()  # assumed final step: keep one row per area type
23 different geographies are returned including GPs, Laboratories, Local Authority districts, and Government Office Regions.
A quick check of the Fingertips website indicates that antidepressant prescription rates are available at the level of Clinical Commissioning Groups (CCGs). The table provides a corresponding AreaTypeID of 153.
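Rather than scanning the table by eye, the area type can also be looked up by name. The sketch below does this on a small stand-in table. Only the CCG row (ID 153) comes from this post; the other rows are illustrative and should not be relied on.

```r
library(dplyr)
library(stringr)

# Hypothetical stand-in for the area types table;
# only the 153 row is taken from the post, the rest is illustrative
toy_area_types <- tibble(
  AreaTypeID   = c(6, 7, 153),
  AreaTypeName = c("Government Office Region",
                   "General Practice",
                   "CCGs (pre 4/2017)")
)

# fixed() treats "CCG" as a literal string rather than a regular expression
ccg_types <- toy_area_types %>%
  filter(str_detect(AreaTypeName, fixed("CCG")))

ccg_types
```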
Extract the data
Armed with the IndicatorID and the AreaTypeID we can now query the Fingertips API using the fingertips_data() function.
query <- fingertips_data(IndicatorID = 90527, AreaTypeID = 153)
Data at CCG level geography for the ‘Antidepressant prescribing: average daily quantities (ADQs) per STAR-PU’ indicator are returned.
Filter the data
The data require a little cleaning before they can be visualised. Only rates for CCGs recorded during 2016/17 are retained.
df <- query %>%
  select(IndicatorID, AreaCode, AreaName, AreaType, Timeperiod, Value) %>%
  filter(AreaType == "CCGs (pre 4/2017)" & Timeperiod == "2016/17")
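After a filter like this it is worth checking that each area appears only once and that no values are missing. The following is a sketch on invented data: the rows and values are hypothetical, and only the column names and filter conditions mirror the code above.

```r
library(dplyr)

# Hypothetical stand-in for the query result; all values are invented
toy_query <- tibble(
  AreaName   = c("England", "NHS Trafford CCG",
                 "NHS Trafford CCG", "NHS Example CCG"),
  AreaType   = c("England", "CCGs (pre 4/2017)",
                 "CCGs (pre 4/2017)", "CCGs (pre 4/2017)"),
  Timeperiod = c("2016/17", "2015/16", "2016/17", "2016/17"),
  Value      = c(1.2, 1.1, 1.3, NA)
)

# Comma-separated conditions in filter() are combined with AND,
# so this is equivalent to joining them with &
toy_df <- toy_query %>%
  filter(AreaType == "CCGs (pre 4/2017)", Timeperiod == "2016/17")

# Sanity checks: rows, distinct areas, and missing values
n_rows    <- nrow(toy_df)
n_areas   <- n_distinct(toy_df$AreaName)
n_missing <- sum(is.na(toy_df$Value))
```

If n_rows and n_areas differ, an area is duplicated. If n_missing is above zero, note that geom_histogram() drops those rows with a warning.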
Visualise the data
The antidepressant prescription data can now be visualised as a simple histogram using ggplot2.
ggplot(df, aes(x = Value)) +
  geom_histogram(bins = 30) +
  labs(x = NULL, y = "Number of CCGs",
       title = "Units of antidepressant prescribed per person",
       subtitle = "Public Health England, 2016-17",
       caption = "Source: Fingertips")
Or using our own ggplot theme:
df_unit <- filter(df, AreaName == "NHS Trafford CCG")
df_background <- df
ggplot(df_unit, aes(x = Value)) +
  # all CCGs in grey as the background layer
  geom_histogram(data = df_background,
                 bins = 30,
                 fill = "#757575",
                 alpha = 0.6) +
  # dashed line at the median of all CCGs
  geom_vline(xintercept = median(df_background$Value, na.rm = TRUE),
             colour = "#757575", linetype = "dashed") +
  # Trafford highlighted on top
  geom_histogram(bins = 30, fill = "#fc6721", colour = "white") +
  scale_x_continuous(limits = c(0, 2.3)) +
  scale_y_continuous(expand = c(0, 0)) +
  labs(x = "Units of antidepressant prescribed per person", y = "Number of CCGs",
       title = "Trafford has an average rate of antidepressant prescribing",
       subtitle = "Public Health England, 2016-17",
       caption = "Source: Fingertips | @traffordDataLab") +
  theme(panel.grid.major.x = element_blank())
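The background data trick used in this plot can be tried without downloading anything. The sketch below repeats the same layering on simulated data. The area names and values are randomly generated and purely hypothetical.

```r
library(ggplot2)

set.seed(1)

# Simulated rates for 50 hypothetical areas
rates <- data.frame(
  area  = paste("Area", 1:50),
  value = runif(50, min = 0.5, max = 2)
)

# The single area to highlight
highlight <- subset(rates, area == "Area 7")

p <- ggplot(highlight, aes(x = value)) +
  # layer 1: every area, drawn in grey
  geom_histogram(data = rates, bins = 30, fill = "#757575", alpha = 0.6) +
  # layer 2: median of all areas
  geom_vline(xintercept = median(rates$value),
             colour = "#757575", linetype = "dashed") +
  # layer 3: the highlighted area, using the plot's default data
  geom_histogram(bins = 30, fill = "#fc6721", colour = "white") +
  scale_x_continuous(limits = c(0, 2.3))

p
```

The plot's default data is the highlighted subset, and the background layer overrides it with its own data argument. Both histogram layers share one x scale and the same number of bins, so the highlighted bar should line up with the grey bars behind it.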
Kudos to the developers of the fingertipsR package and Simon Jackson for the handy plotting background data trick in ggplot2.
Written by Henry Partridge, Manager of the Trafford Data Lab
Fox, Sebastian, & Flowers, Julian (2017). fingertipsR: Fingertips Data for Public Health. R package version 0.1.3. https://CRAN.R-project.org/package=fingertipsR
O’Connor, Sarah (2017). “Left behind: can anyone save the towns the economy forgot?”, Financial Times, 16 November 2017. Available via: https://www.ft.com/content/b6dbf34e-c987-11e7-aa33-c63fdc9b8c6c
Wickham, Hadley (2017). tidyverse: Easily Install and Load ‘Tidyverse’ Packages. R package version 1.1.1. https://CRAN.R-project.org/package=tidyverse
Wickham, Hadley (2017). stringr: Simple, Consistent Wrappers for Common String Operations. R package version 1.2.0. https://CRAN.R-project.org/package=stringr
Wickham, Hadley (2016). ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York.