The tools of the trade in computational journalism
Programming languages, web apps, libraries and frameworks are among the 50 or so different tools used or expected to be used by media organizations wishing to delve into data-driven or computational reporting.
The diversity of the tools used by computational journalists in 2015 is astounding.
This is one of the preliminary conclusions of a research project on computational and data-driven journalism in the Canadian province of Quebec.
I interviewed 30 participants for this project. Each was asked to outline their workflow and to provide detailed examples of how they gathered data, how they processed it and how they presented or visualized it. They were also asked to describe the tools they used or wished they could use. A little more than 50 different software applications, programming languages, libraries, frameworks or web apps were mentioned.
The chart below thus presents the extent of what is required or believed to be required to perform computational journalism in 2015.
The vertical axis shows the relative complexity of each tool, those at the bottom being the easiest to master, those at the top the most difficult.
The horizontal axis shows the tools relative to the moment they might be used in what one participant called the “computational journalism pipeline” (gathering => processing => presenting/displaying/visualizing).
The chart is not perfect. Some tools were mentioned but are not actually used, like Hadoop, which is overkill for most data-driven journalism projects. Some tools were hard to place, like GitHub, which has been put in the “Presenting” area because journalists mainly use it to showcase their work.
The 30 interviews for this project were conducted in May, June and July 2015. Of the participants, 18 are practitioners, meaning they are reporters, researchers or assignment editors who work or have worked, on staff or as freelancers, for media organizations in Quebec. Six are editors for the most important newsrooms in the province. The remaining six are developers or data scientists who are not employed by media organizations, but who have shown interest in data journalism either by participating in meetups or hackathons involving journalism or by helping journalists on specific projects.
I also produced French and Spanish versions of the chart.
The French version presents what can be useful in computational journalism in 2015:
The Spanish version presents the tools of data or computational journalism: