A graphics reporting project: Why R was my saviour and Excel was not
While working at the graphics desk of the Dutch newspaper De Volkskrant, I was tasked with analysing all kinds of trends around one theme: how did the U.S. change after the first year of President Donald J. Trump?
On top of covering the ‘usual subjects’ like approval ratings, unemployment rates, GDP growth, the world’s trust in the U.S. president and the arrests of illegal immigrants, I did a Trump tweet analysis. I wondered when he tweeted as president, as a candidate and as a businessman before his political career. I also analysed his most used word pairs as president.
My workflow was the following: in the data analysis language R, I created an app, or dashboard if you will (I’ll link to it in a minute).
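To give an idea of what a word pair analysis involves, here is a minimal sketch in base R. It is not the code from my dashboard: the example tweets and the function name `count_word_pairs` are made up for illustration, and the tokenising is deliberately crude.

```r
# Hypothetical example tweets; a real analysis would load a tweet archive.
tweets <- c(
  "Fake news is fake news",
  "Great jobs numbers, great economy",
  "The fake news media"
)

# Count pairs of adjacent words across all texts (hypothetical helper).
count_word_pairs <- function(texts) {
  pairs <- unlist(lapply(texts, function(text) {
    # Lowercase, drop everything except a-z and spaces, split on spaces
    words <- strsplit(gsub("[^a-z ]", "", tolower(text)), " +")[[1]]
    words <- words[nzchar(words)]
    if (length(words) < 2) return(character(0))
    # Pair each word with the word that follows it
    paste(head(words, -1), tail(words, -1))
  }))
  # Most frequent pairs first
  sort(table(pairs), decreasing = TRUE)
}

pair_counts <- count_word_pairs(tweets)
head(pair_counts, 3)
```

Pairs are counted within each tweet only, so the last word of one tweet is never paired with the first word of the next.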
This single script:
- Scraped data from all kinds of sources
- Cleaned and tidied the data from these sources
- Created data visualisations, first for exploratory analysis, later for communication
- Produced an R Markdown HTML document as output, for communication with my editors, but also for the sake of reproducibility
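The steps above can be sketched as three small functions in one script. This is a simplified illustration under assumptions, not my actual script: the function names, the inline CSV (standing in for a scraped source so the sketch runs offline) and the numbers in it are all made up, and the chart uses base R graphics.

```r
# Step 1: acquire. A real script would scrape or download here.
# The inline CSV and its values are made up for illustration.
fetch_data <- function() {
  raw_csv <- "month,rate
2017-01,4.7
2017-02,4.7
2017-03,n/a
2017-04,4.4"
  read.csv(text = raw_csv, stringsAsFactors = FALSE)
}

# Step 2: clean and tidy. Parse dates and numbers, drop unusable rows.
clean_data <- function(raw) {
  cleaned <- data.frame(
    month = as.Date(paste0(raw$month, "-01")),
    rate = suppressWarnings(as.numeric(raw$rate))
  )
  cleaned[!is.na(cleaned$rate), ]
}

# Step 3: visualise. Write a simple line chart to a PDF file.
plot_rate <- function(cleaned, path) {
  pdf(path, width = 7, height = 4)
  on.exit(dev.off())
  plot(cleaned$month, cleaned$rate, type = "l",
       xlab = "Month", ylab = "Rate (%)")
  invisible(path)
}

# Running the script top to bottom repeats every step with fresh data.
cleaned <- clean_data(fetch_data())
chart_path <- plot_rate(cleaned, file.path(tempdir(), "rate.pdf"))
```

Because every step is a function call in the same script, rerunning it later picks up whatever the source contains at that moment.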
Why R was my saviour and Excel was not
Why in one R script? Well, I started this project in early November 2017. On 21 January 2018 the newspaper would publish a ‘Trump special’ with my charts alongside various articles, and much of the data would keep changing right up to that date. Can you already see the problem if you tried to do the same project in Excel?
At the beginning of my still young career as a data journalist (almost a year in), I chose to learn R instead of Excel for data analysis. After this project, I couldn’t be happier with that choice.
Three reasons why R was better suited to this project than Excel:
- Scraping is not something Excel is built for
- R is (in my humble opinion) so much better at producing exploratory, communicative and publishable visualisations with little effort
- R is communicative and reproducible through R Markdown and HTML reports
But the best thing was… in the week before the publication date, the only thing I had to do was run and ‘knit’ the script with the keyboard shortcut Cmd+Shift+K. The script, and therefore all the data and every chart, visualisation and table, was updated.
I sent all the charts as PDFs to my graphics editor, and he finalised the visualisations for the newspaper in his tool of choice: Illustrator.
With Excel you would probably have to redo the scraping in another tool to get the new data, import it, clean it again, etc.
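Knitting can also be done from code instead of the keyboard shortcut. The sketch below writes a tiny, hypothetical R Markdown file to a temporary folder and renders it to HTML with `rmarkdown::render()`. The file name and report contents are made up, and the render step only runs if the rmarkdown package and pandoc are available.

```r
# Build the code fence marker programmatically to keep this example readable.
fence <- strrep("`", 3)

# Write a tiny, hypothetical R Markdown report to a temporary file.
rmd_path <- file.path(tempdir(), "report.Rmd")
writeLines(c(
  "---",
  "title: \"Hypothetical report\"",
  "output: html_document",
  "---",
  "",
  "Last updated: `r Sys.Date()`",
  "",
  paste0(fence, "{r}"),
  "summary(cars)",
  fence
), rmd_path)

# Re-run every chunk and rebuild the HTML report.
# Guarded so the sketch does not fail where rmarkdown or pandoc is missing.
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  html_path <- rmarkdown::render(rmd_path,
                                 output_format = "html_document",
                                 quiet = TRUE)
}
```

Every time the report is rendered, all chunks run again, so anything that depends on fresh data is rebuilt along with the HTML.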
I’ll always be learning and adjusting my workflow, but for me, and hopefully for some time, R is THE tool for modern data journalism/graphics reporting. It does everything: acquiring, cleaning, transforming, visualising and modelling data, and communicating the results, in one language and, for now, one environment: RStudio.
I’ll keep on making R Markdown reports for data analysis. The next thing for me to explore will be Shiny, an R package that makes it easy to build interactive web apps straight from R. This way I can communicate findings to my editors and colleagues even better.
Before I send you to my Trump dashboard with code and data visualisations, I want to draw your attention to some disclaimers. The report is actually quite rough, unedited and not fit for publication, and I have to say it’s in Dutch (the code and code comments are not, though). There are also some experimental data visualisations and additional subjects that didn’t make it to print. Keep in mind that it’s also not designed responsively for mobile, and it takes a few seconds for the code snippets to fold.
That said, I find it essential to share what I’ve learned, because I learn so much from others too. Enough with the disclaimer talk, here’s the Trump dashboard: Hoe de VS veranderde na een jaar president Trump (How the U.S. changed after a year of President Trump).