Automating our coronavirus coverage

Ryan Watts
Digital Times
Apr 28, 2020

By Ryan Watts and Sam Joiner,
Times and Sunday Times Data and Digital Storytelling team

Covid-19 did not shoot to the top of the news agenda overnight and our first charts and maps on the outbreak appeared in our digital editions in January.

As the virus spread and a national disaster became a global pandemic, presenting the latest information to our readers became an important service. Almost overnight, we had a whole host of visuals that needed daily updates.

Principally, three processes were taking up our time: adding the latest data to our charts, maps and tables; recreating these for the print edition; and updating the words in our global explainer and UK tracker.

The unprecedented nature of the outbreak meant this was undoubtedly a good use of resources. But with updates required at regular intervals, live feeds of data from reputable sources and a powerful charting tool at our disposal, could much of it be automated?

The answer was yes, and this is how we did it.

Automating our charts

Using R, a programming language for working with data, we wrote a script that pulls statistics from the Johns Hopkins dashboard and used Windows Task Scheduler to set it to publish every hour.

An R package that works with the API of Datawrapper, our chart-building tool, allows the user to update a chart’s data, add a timestamp as a note (e.g. last updated at 3pm, March 19) and republish, all by running the same script.
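The three steps described here (upload data, set a timestamp note, republish) can be sketched against Datawrapper's HTTP API. This is a Python illustration rather than the team's R script, and it only builds the requests without sending them. The chart ID, token and function names are hypothetical, and the endpoint paths reflect our understanding of the v3 API, so check them against Datawrapper's documentation before relying on them.

```python
import json
from datetime import datetime
from typing import List
from urllib.request import Request

API_ROOT = "https://api.datawrapper.de/v3"


def timestamp_note(now: datetime) -> str:
    """Format a note such as 'Last updated at 3pm, March 19'."""
    hour = now.hour % 12 or 12                      # 12-hour clock, no leading zero
    suffix = "am" if now.hour < 12 else "pm"
    minutes = f":{now.minute:02d}" if now.minute else ""
    return f"Last updated at {hour}{minutes}{suffix}, {now.strftime('%B')} {now.day}"


def build_update_requests(chart_id: str, csv_text: str, token: str,
                          now: datetime) -> List[Request]:
    """Build (but do not send) the three requests for one chart refresh."""
    auth = {"Authorization": f"Bearer {token}"}

    # 1. Replace the chart's data with the latest CSV.
    upload = Request(
        f"{API_ROOT}/charts/{chart_id}/data",
        data=csv_text.encode("utf-8"),
        headers={**auth, "Content-Type": "text/csv"},
        method="PUT",
    )

    # 2. Write the timestamp into the chart's notes field.
    body = json.dumps({"metadata": {"annotate": {"notes": timestamp_note(now)}}})
    annotate = Request(
        f"{API_ROOT}/charts/{chart_id}",
        data=body.encode("utf-8"),
        headers={**auth, "Content-Type": "application/json"},
        method="PATCH",
    )

    # 3. Republish so readers see the new version.
    publish = Request(f"{API_ROOT}/charts/{chart_id}/publish",
                      headers=auth, method="POST")
    return [upload, annotate, publish]
```

Each request could then be passed to `urllib.request.urlopen` in order, with the whole script run on a schedule.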

Our UK coronavirus tracker also includes two animated maps built in D3 that show how the number of cases has grown at local authority level across the UK.

The data that informs them is generated in the same way: an R script pulls the latest update, formats it and joins latitudes and longitudes. It is then pushed to an S3 bucket on AWS from R.

An example of how we update a Datawrapper chart and send updated data to S3 with R
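The formatting-and-joining step for the map data can be sketched in a few lines. Again this is a Python illustration, not the team's R code. The column names (`area_code`, `area_name`, `cases`) and the coordinate lookup are assumptions made for the example. Areas with no coordinates are reported rather than silently dropped, so a change in the feed is noticed.

```python
import csv
import io
from typing import Dict, List, Tuple


def join_coordinates(cases_csv: str,
                     coords: Dict[str, Tuple[float, float]]) -> Tuple[str, List[str]]:
    """Attach a latitude and longitude to each row of a cases CSV.

    Returns the joined CSV text and a list of area codes that had no match.
    """
    reader = csv.DictReader(io.StringIO(cases_csv))
    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=["area_code", "area_name", "cases", "lat", "lon"],
        lineterminator="\n",
    )
    writer.writeheader()

    unmatched: List[str] = []
    for row in reader:
        code = row["area_code"]
        if code not in coords:
            unmatched.append(code)      # flag for a human to look at
            continue
        lat, lon = coords[code]
        writer.writerow({
            "area_code": code,
            "area_name": row["area_name"].strip(),
            "cases": int(row["cases"]),  # fails loudly on non-numeric values
            "lat": lat,
            "lon": lon,
        })
    return out.getvalue(), unmatched
```

The returned CSV text is what would be uploaded to the S3 bucket for the D3 maps to read.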

The data we’ve used for maps and charts of the UK comes from Public Health England. New data is usually published between 4pm and 7pm, but the time has varied considerably and has been as late as 10pm.

To save us from the joys of constantly refreshing the page we wrote a script that downloads the daily summary data, checks that the date in the file matches today’s and, if it does, sends a notification in Slack to tell us new data is available.

This script checks every two minutes whether PHE have updated their dashboard
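A check along these lines could be sketched as follows, in Python rather than the team's R. The `date` column name and the message wording are assumptions. The JSON body with a `text` field is the format Slack's incoming webhooks accept. The request is built but not sent, and the download step and the two-minute schedule are left out.

```python
import csv
import io
import json
from datetime import date
from typing import Optional
from urllib.request import Request


def latest_date_in_file(csv_text: str, date_column: str = "date") -> Optional[date]:
    """Return the most recent ISO date found in the file, or None if there is none."""
    rows = csv.DictReader(io.StringIO(csv_text))
    dates = [date.fromisoformat(r[date_column]) for r in rows if r.get(date_column)]
    return max(dates) if dates else None


def build_slack_alert(webhook_url: str, csv_text: str, today: date) -> Optional[Request]:
    """Build a Slack webhook request only if the file already contains today's data."""
    if latest_date_in_file(csv_text) != today:
        return None  # nothing new yet; try again on the next run
    payload = json.dumps(
        {"text": f"New PHE data is available for {today.isoformat()}"}
    ).encode("utf-8")
    return Request(webhook_url, data=payload,
                   headers={"Content-Type": "application/json"}, method="POST")
```

A scheduler would call `build_slack_alert` after each download and send the request with `urllib.request.urlopen` when it is not `None`, stopping once the alert has gone out so the channel is not flooded.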

Automating the words

Once the data comes through from Public Health England we use it to update the copy in our UK tracker, using R to generate the figures in the article at the touch of a button.

The process runs a script which updates every variable within the page and then stitches it together before exporting a .txt file.

We then share that file with our online news team who can confidently make updates to the page whenever they are needed.

How we update the first paragraph of our coronavirus tracker page with R
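The idea of generating copy from figures can be illustrated with a small template. This is a Python sketch, not the team's R script. The sentence wording, the field names and the numbers in the usage example are invented for illustration.

```python
from datetime import date
from pathlib import Path
from typing import Dict, List


def build_intro(figures: Dict[str, int], as_of: date) -> str:
    """Fill the opening paragraph from the latest figures."""
    day = f"{as_of.strftime('%B')} {as_of.day}"   # e.g. "April 27"
    return (
        f"There have been {figures['total_cases']:,} confirmed cases of "
        f"coronavirus in the UK as of {day}, an increase of "
        f"{figures['new_cases']:,} on the previous day."
    )


def export_copy(paragraphs: List[str], path: Path) -> None:
    """Stitch the paragraphs together and write them to a .txt file."""
    path.write_text("\n\n".join(paragraphs) + "\n", encoding="utf-8")
```

For example, `build_intro({"total_cases": 12345, "new_cases": 678}, date(2020, 4, 27))` produces a sentence with the figures formatted as "12,345" and "678". Keeping each paragraph in its own function makes it easy for an editor to review one sentence at a time before the file is shared.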

Automating graphics for the paper

A map that works online might need to be rethought for print given space and design requirements, so we set up a separate script that generates two more maps for the graphics team:

  1. A world choropleth map showing cases per million people
  2. A proportional circle or ridge map of cases in the UK
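The figure behind the world choropleth is a simple rate: cases divided by population, multiplied by one million. A minimal Python sketch (the function name and rounding are our choices, and the team's own calculation is in R):

```python
from typing import Optional


def cases_per_million(cases: int, population: Optional[int]) -> Optional[float]:
    """Return cases per million people, or None when population is unusable."""
    if not population or population <= 0:
        return None  # leave the country unshaded rather than divide by zero
    return round(cases / population * 1_000_000, 1)
```

With made-up numbers, 500 cases in a population of 2,000,000 gives 250.0 cases per million.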

The aforementioned R script pushes the data to Datawrapper, allowing our graphics desk to download a PDF which can be opened in Adobe Illustrator and prepared for the page.

The ridge map was too complex for Datawrapper, meaning our graphics desk is unable to download and reuse the map in the same way.

To solve this, we built a page to host the ridge map which our graphics team now use to download the most recent version.

The raw data behind the map is published in the same way, allowing the graphics team to use the totals for regions when needed.

This page automatically updates when we change the version of the map used on the website.

The barebones page we use to update the graphics desk

What have we learned?

You still need humans

No matter how much a process can be automated you still need someone to make decisions and review what is being produced.

For our coronavirus coverage, this means feeds are checked for errors on a daily basis and editors continue to do a manual review of all the figures before anything is published.

This goes for the graphics as well. For our UK map, for example, the volume of cases meant the circles on our symbol map were beginning to overlap and individual areas were becoming difficult to see.

Reviewing it as a team, we decided to switch to a ridge map, which clearly shows the peaks and has the added benefit of allowing us to label key areas.

But overall, automating the process of updating both pages has saved countless hours and allowed the team to focus on other ways we can report on both the coronavirus outbreak and stories unrelated to the pandemic.

Stick to the script

Harmless and well-intentioned amendments to the data (like changing the UK to the United Kingdom) or routine events such as the clocks going forward can throw a spanner in the works.

A little bit of error-handling helps, but it is still important to check formatting and feeds for errors on a regular basis.
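One form this error-handling can take is to map every known spelling of a name to a single canonical one and to stop loudly when something unrecognised turns up. The Python sketch below uses a hypothetical alias table. It is an illustration of the idea, not the team's R code.

```python
from typing import Dict, Iterable, List

# Hypothetical alias table: every spelling seen so far, keyed in lower case.
ALIASES: Dict[str, str] = {
    "uk": "United Kingdom",
    "united kingdom": "United Kingdom",
    "us": "United States",
    "united states": "United States",
}


def canonical_names(raw_names: Iterable[str],
                    aliases: Dict[str, str] = ALIASES) -> List[str]:
    """Normalise names from the feed; raise if any are not recognised."""
    cleaned: List[str] = []
    unknown: List[str] = []
    for raw in raw_names:
        key = raw.strip().lower()
        if key in aliases:
            cleaned.append(aliases[key])
        else:
            unknown.append(raw)
    if unknown:
        # Fail before publishing so a person can decide what the new name means.
        raise ValueError(f"Unrecognised names in feed: {unknown}")
    return cleaned
```

Raising an error instead of guessing keeps a renamed country from quietly vanishing from a chart.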

Keeping everything together saves on duplication

In the rush to tell a story lots of people often have the same idea. That can also be the case within the same newsroom and avoiding the commissioning of charts which already exist is a daily challenge.

Our Datawrapper Slack integration — which automatically posts published charts into a channel — plays an important role here, but talking to colleagues is key when working remotely.

On a much more manual level, a public shout of “have any of you built this already?” can save a lot of time.

A Datawrapper webhook sends a message to a Slack channel every time a chart is published

We’re updating our system as requirements change and the story shifts and we’d love to hear from you if you think you can make our system more efficient. Send me a message on Twitter: @ryanleewatts

Ryan Watts
Digital Times

Interactive journalist at The Times and Sunday Times