Public health data at your fingertips
This is the second in a series of posts charting the design choices, open source tools and analytical workflows that the Trafford Data Lab are adopting.
The Trafford Data Lab supports decision-making in Trafford, a local authority in Greater Manchester, by revealing patterns in data through visualisation. It is committed to publishing open data and using open source tools to encourage a transparent and reproducible analytical workflow.
Inspired by the data visualisations in yesterday’s informative Financial Times article (O’Connor 2017) on the relationship between Blackpool’s economy and the health of its residents, I decided to visualise some of the same public health data for Trafford. Using the new fingertipsR package I’ll show you how to download some data from Public Health England’s Fingertips repository and visualise it in R with ggplot2.
Load the necessary R packages
The fingertipsR package allows users to query the API for Fingertips, a repository of public health indicators managed by Public Health England.
library(fingertipsR) ; library(tidyverse) ; library(stringr) ; library(ggplot2)
The key function of the fingertipsR package is fingertips_data(), which requires an indicator and an area type to be supplied. Before we can provide these values we need to know which indicators and area types are available on Fingertips.
Return a list of indicators from Fingertips
Use the indicators() function to retrieve all of the indicators from Fingertips.
indicators <- indicators()
The resulting data frame contains over 4,500 public health indicators. They are grouped into over 200 domains, including ‘Child obesity’, ‘Hospital admissions’ and ‘Suicide data’, and into 66 higher-level profiles.
Filter the indicator name field for “antidepressant”
We are interested in visualising antidepressant prescription rates so need to filter the IndicatorName field.
indicators %>%
  filter(str_detect(IndicatorName, regex('antidepressant', ignore_case = TRUE))) %>%
  View()
Note that we use regex and the str_detect function from the stringr package to search for the “antidepressant” string (ignoring case).
The resulting data frame contains 6 indicators with “antidepressant” in the IndicatorName field. Let’s choose the indicator with the IndicatorID of 90527.
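To see how the case-insensitive match behaves without calling the API, here is a minimal sketch on a made-up table. The `toy_indicators` data frame, its IDs and its names are invented for illustration and are not real Fingertips records; only `filter()`, `str_detect()` and `regex()` are taken from the code above.

```r
library(dplyr)
library(stringr)

# Invented indicator names for illustration only (not real Fingertips records)
toy_indicators <- data.frame(
  IndicatorID = c(1, 2, 3),
  IndicatorName = c(
    "Antidepressant prescribing: toy indicator",
    "Hospital admissions: toy indicator",
    "Rate of ANTIDEPRESSANT items: toy indicator"
  ),
  stringsAsFactors = FALSE
)

# regex(..., ignore_case = TRUE) makes the pattern match regardless of case
matches <- toy_indicators %>%
  filter(str_detect(IndicatorName, regex("antidepressant", ignore_case = TRUE)))

# IDs of the rows whose name contains the search string
matches$IndicatorID
```

Without `ignore_case = TRUE` the lower-case pattern would miss both the capitalised and the upper-case spellings in this toy table.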
Identify the area type of interest
The public health data on Fingertips are aggregated at different statistical and administrative geographies. To return a list of all the available geographies you need to run the area_types() function.
area_types() %>%
  select(AreaTypeID, AreaTypeName) %>%
  distinct() %>%
  View()
The function returns 23 different geographies, including GPs, Laboratories, Local Authority districts, and Government Office Regions.
A quick check of the Fingertips website indicates that antidepressant prescription rates are available at the level of Clinical Commissioning Groups (CCGs). The table returned by area_types() gives the corresponding AreaTypeID of 153.
Extract the data
Armed with the IndicatorID and the AreaTypeID we can now query the Fingertips API using the fingertips_data() function.
query <- fingertips_data(IndicatorID = 90527, AreaTypeID = 153)
Data at CCG level geography for the ‘Antidepressant prescribing: average daily quantities (ADQs) per STAR-PU’ indicator are returned.
Filter the data
The data require a little cleaning before they can be visualised. Only rates for CCGs recorded during 2016/17 are retained.
df <- query %>%
  select(IndicatorID, AreaCode, AreaName, AreaType, Timeperiod, Value) %>%
  filter(AreaType == "CCGs (pre 4/2017)" & Timeperiod == "2016/17")
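A quick sanity check after a filter like this is to confirm that each area appears only once, since a histogram would otherwise count some areas twice. The sketch below runs on a made-up data frame. `toy_query`, its area codes, its values and the "England" row are all invented stand-ins, and only the column names and the two filter values come from the code above.

```r
library(dplyr)

# Invented rows mimicking the columns kept above; codes and values are made up
toy_query <- data.frame(
  AreaCode = c("X01", "X01", "X02", "X99"),
  AreaType = c("CCGs (pre 4/2017)", "CCGs (pre 4/2017)",
               "CCGs (pre 4/2017)", "England"),
  Timeperiod = c("2015/16", "2016/17", "2016/17", "2016/17"),
  Value = c(1.1, 1.2, 1.5, 1.3),
  stringsAsFactors = FALSE
)

# Same filter as above: CCG rows for the 2016/17 period only
toy_df <- toy_query %>%
  filter(AreaType == "CCGs (pre 4/2017)", Timeperiod == "2016/17")

# Any AreaCode listed here appears more than once and needs a closer look
duplicates <- toy_df %>%
  count(AreaCode) %>%
  filter(n > 1)

nrow(duplicates)
```

Running the same `count()` step on `df` would show whether the real extract needs any further de-duplication before plotting.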
Visualise the data
The antidepressant prescription data can now be visualised as a simple histogram using ggplot2.
ggplot(df, aes(x = Value)) +
  geom_histogram(bins = 30) +
  labs(x = NULL, y = "Number of CCGs",
       title = "Units of antidepressant prescribed per person",
       subtitle = "Public Health England, 2016-17",
       caption = "Source: Fingertips")
Or using our own ggplot theme:
source("https://trafforddatalab.github.io/assets/theme/ggplot2/theme_lab.R")
# Highlighted unit (Trafford) and the full set of CCGs used as the grey background
df_unit <- filter(df, AreaName == "NHS Trafford CCG")
df_background <- df
ggplot(df_unit, aes(x = Value)) +
  geom_histogram(data = df_background,
                 bins = 30,
                 fill = "#757575",
                 alpha = 0.6) +
  # Dashed line marks the median across all CCGs
  geom_vline(xintercept = median(df$Value), colour = "#757575", linetype = "dashed") +
  geom_histogram(bins = 30, fill = "#fc6721", colour = "white") +
  scale_x_continuous(limits = c(0, 2.3)) +
  scale_y_continuous(expand = c(0, 0)) +
  labs(x = "Units of antidepressant prescribed per person", y = "Number of CCGs",
       title = "Trafford has an average rate of antidepressant prescribing",
       subtitle = "Public Health England, 2016-17",
       caption = "Source: Fingertips | @traffordDataLab") +
  theme_lab() +
  theme(panel.grid.major.x = element_blank())
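The chart title describes Trafford's rate as average. One way to back up a reading like this with numbers is to compare the highlighted value with the median and with its position in the overall distribution. This is a minimal sketch on invented values: the `toy` data frame and "Area E" are hypothetical stand-ins for `df` and "NHS Trafford CCG", and none of the numbers come from Fingertips.

```r
# Invented values standing in for df$Value; "Area E" plays the highlighted unit
toy <- data.frame(
  AreaName = paste("Area", LETTERS[1:9]),
  Value = c(0.8, 1.0, 1.2, 1.3, 1.4, 1.5, 1.7, 1.9, 2.1),
  stringsAsFactors = FALSE
)

# Value for the highlighted unit
unit_value <- toy$Value[toy$AreaName == "Area E"]

# Median across all areas (the quantity drawn as the dashed line above)
median_value <- median(toy$Value)

# Share of areas with a value at or below the highlighted unit
share_at_or_below <- ecdf(toy$Value)(unit_value)

c(unit = unit_value, median = median_value, share = share_at_or_below)
```

A share close to 0.5 means roughly half of the areas sit at or below the highlighted one, which is what an "average" label suggests.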
Kudos to the developers of the fingertipsR package, and to Simon Jackson for the handy trick of plotting background data in ggplot2.
Written by Henry Partridge, Manager of the Trafford Data Lab
References
Fox, Sebastian, & Flowers, Julian (2017). fingertipsR: Fingertips Data for Public Health. R package version 0.1.3. https://CRAN.R-project.org/package=fingertipsR
O’Connor, Sarah (2017). “Left behind: can anyone save the towns the economy forgot?”, Financial Times, 16 November 2017. Available via: https://www.ft.com/content/b6dbf34e-c987-11e7-aa33-c63fdc9b8c6c
Wickham, Hadley (2017). tidyverse: Easily Install and Load ‘Tidyverse’ Packages. R package version 1.1.1. https://CRAN.R-project.org/package=tidyverse
Wickham, Hadley (2017). stringr: Simple, Consistent Wrappers for Common String Operations. R package version 1.2.0. https://CRAN.R-project.org/package=stringr
Wickham, Hadley (2016). ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York.