How BBC data journalists use R for data visualization
Senior data journalists Clara Guibourg and Nassos Stylianou explain how adopting the R programming language simplified the graphics workflow at BBC News.
At the latest Hacks/Hackers London event, Clara Guibourg and Nassos Stylianou from the BBC Visual and Data Journalism team revealed how their team revamped its graphics workflow over the past year by developing bbplot, an R package built on ggplot2 that produces charts in the BBC's house style.
The package increased journalists’ productivity by giving more autonomy to the data team and freeing the graphics team from making the same charts over and over. As Guibourg explained: “Within the data team, we were using R for data analysis for quite a long time but, when it came to making charts, we had two options: if it was a quick turn-around thing, we made it ourselves, using the in-house chart tool; if we had more time, we would commission a chart from our designers.”
Last year, the data team concluded that it would be much better for their workflow to do everything in one place and to “go from the analysis step to the publication-ready chart, all within the same tool,” Guibourg said.
The ten members of the data team started experimenting, and in March 2018 they published the first BBC chart made from start to finish in R. Since then, the team has developed an open-source R cookbook: a ready-made collection of instructions that anyone can copy or read to learn by example.
You can watch Guibourg and Stylianou’s full talk on our YouTube channel and read below for our main takeaways.
Why did the BBC choose R?
R is a statistical programming language used by many newsrooms for data analysis. Used in combination with the ggplot2 visualization package, it can show the distribution of large datasets and turn them quickly into charts.
“We found a lot of benefits with R. It gives you more freedom in terms of how you want your chart to look. Because we work with scripts now, everything is much more reproducible,” Guibourg said.
The bbplot package makes it possible to export a BBC-styled chart that needs no further modification in just a few steps. Furthermore, Guibourg said, working with scripts saves time, especially when updating charts for a story.
How BBC journalists use bbplot and the cookbook
The package was developed to simplify the process of making charts, so its main purpose is to quickly create graphics in the style of the BBC News website. It provides two functions: bbc_style() and finalise_plot().
The function bbc_style() changes ggplot2’s default appearance into the BBC’s style by modifying arguments in the ggplot2 theme. It defines the font, text size, colour, axes and other main components. As written in the cookbook, “the function does not change or adapt based on the type of chart you are making,” so a chart might still need additional, last-minute changes.
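As a rough illustration of that first step, here is a minimal sketch of adding bbc_style() to an ordinary ggplot2 chart. It assumes that ggplot2 and bbplot are installed (bbplot is distributed via GitHub). The dataset, the line colour and the title text are illustrative choices made for this example, not taken from the BBC's own charts.

```r
# Assumption: ggplot2 and bbplot are installed. bbplot is distributed on
# GitHub, e.g. devtools::install_github("bbc/bbplot").
library(ggplot2)
library(bbplot)

# Illustrative data: the `economics` dataset that ships with ggplot2
# (`unemploy` is the number of unemployed people, in thousands).
line_chart <- ggplot(economics, aes(x = date, y = unemploy)) +
  geom_line(colour = "#1380A1") +  # arbitrary colour chosen for this sketch
  bbc_style() +                    # swap ggplot2's default look for the BBC theme
  labs(
    title = "Illustrative title",
    subtitle = "Number of unemployed people in the US, thousands"
  )

# Draw the chart on the active graphics device.
print(line_chart)
```

Because bbc_style() is added like any other ggplot2 theme, the rest of the chart code stays plain ggplot2.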
According to the BBC: “The idea was that bbc_style(), the function we created for changing ggplot2’s default appearance to our in-house style, should get you 90% of the way, leaving you in control to make any additional tweaks to your chart, rather than it feeling akin to a chart tool that just presented you with finished graphics and with little room for manoeuvre.”
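The "additional tweaks" mentioned above are ordinary ggplot2 theme() calls placed after bbc_style(). The sketch below shows one plausible adjustment for a horizontal bar chart, where vertical gridlines are more useful than horizontal ones. The dataset (R's built-in mtcars) and the colours are assumptions made for illustration only.

```r
# Assumption: ggplot2 and bbplot are installed.
library(ggplot2)
library(bbplot)

# Illustrative data: count of cars by number of cylinders in `mtcars`.
bars <- ggplot(mtcars, aes(x = factor(cyl))) +
  geom_bar(fill = "#1380A1") +  # arbitrary fill colour for this sketch
  coord_flip() +                # turn the bars horizontal
  bbc_style() +
  # A theme() call added after bbc_style() overrides only the settings it
  # names and leaves the rest of the BBC theme untouched.
  theme(
    panel.grid.major.x = element_line(colour = "#cbcbcb"),
    panel.grid.major.y = element_blank()
  ) +
  labs(title = "Illustrative title", subtitle = "Cars by number of cylinders")

print(bars)
```

The order matters: a theme() call placed before bbc_style() could be overwritten by it, so the tweaks come last.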
The second function of the package, finalise_plot(), is the last step of the process. It makes the final adjustments before exporting: according to the cookbook, it will “left-align the title, subtitle and add the footer with a source and an image in the bottom right corner of your plot. It will also save it to your specified location.”
Finally, the cookbook gathers the team’s knowledge. Whenever a member of the team develops a new script, the code goes into the cookbook and becomes available to the entire team and, for the past couple of months, to every journalist who wants to experiment with the tool.
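A minimal sketch of this export step follows, assuming ggplot2 and bbplot are installed. The chart, the source text and the output path are made up for the example, and the pixel dimensions are simply the values picked here. No logo path is passed, so the function falls back on whatever footer image the package uses by default.

```r
# Assumption: ggplot2 and bbplot are installed.
library(ggplot2)
library(bbplot)

# Build a small chart to export (illustrative data from ggplot2).
line_chart <- ggplot(economics, aes(x = date, y = unemploy)) +
  geom_line(colour = "#1380A1") +
  bbc_style() +
  labs(
    title = "Illustrative title",
    subtitle = "Number of unemployed people in the US, thousands"
  )

# Hypothetical output location: a file in the session's temporary directory.
out_path <- file.path(tempdir(), "illustrative-chart.png")

# Left-align the title and subtitle, add the footer with the source text,
# and save the result as a PNG at the given size.
finalise_plot(
  plot_name = line_chart,
  source_name = "Source: ggplot2 economics dataset",
  save_filepath = out_path,
  width_pixels = 640,
  height_pixels = 450
)
```

Since the whole chain from data to saved file lives in one script, updating a chart for a story means rerunning the script with new data.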
How did the BBC manage the transition?
It is not easy to change workflow in a massive work environment like the BBC, but “we essentially managed to succeed,” Nassos Stylianou said.
The data team researched online how other coders had previously solved the same problem: “It wasn’t anything brand new, it was just making things work for us and bringing them together in one place,” Stylianou said.
Key to success was that “the transition wasn’t a responsibility of a single person.” The data team started to use a Slack channel to share bits of code with each other, and each member of the team helped build the functions in a collective effort.
“The transition worked because it was a team effort. Each person sort of built on the other person’s work and we really worked together to get there,” Stylianou said.