I applied for the JSK Fellowship with the following proposal: How might we grow an open-source ecosystem of tools to help data journalists collect, analyze and publish the data underlying their stories?
My starting point for this question was my Datasette open-source project. Datasette is a tool for exploring and publishing data. It provides an interface for exploring small or large datasets, an API for integrating that data into custom applications and a collection of tools for publishing that data to the internet.
I designed Datasette based on my experience working with the Datablog team at the Guardian from…
Two interesting data sources have emerged in the past few weeks concerning the Russian impact on the 2016 US elections.
Separately, the House Intelligence Committee Minority released 3,517 Facebook ads that were reported to have been bought by the Russian Internet Research Agency as a set of redacted PDF files.
The initial data was released as zip files full of PDFs, one of the least friendly formats you can use to publish data.
Ed Summers took…
Keeping documentation synchronized with an evolving codebase is difficult. Without extreme discipline, it’s easy for documentation to fall out of date as new features are added.
One thing that can help is keeping the documentation for a project in the same repository as the code itself. This allows you to construct the ideal commit: one that includes the code change, the updated unit tests AND the accompanying documentation all in the same unit of work.
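Once documentation and code share a repository, the test suite can read both. As a rough sketch of what that makes possible (the setting names, the documentation text and the `undocumented_names` helper below are all hypothetical, made up for illustration rather than taken from any real project), a test could fail whenever a name defined in code is never mentioned in the docs:

```python
import re


def undocumented_names(names, docs_text):
    """Return the names that never appear as a whole word in docs_text."""
    missing = []
    for name in names:
        # Word boundaries stop "row_limit" from matching inside "row_limit_max"
        if not re.search(r"\b" + re.escape(name) + r"\b", docs_text):
            missing.append(name)
    return sorted(missing)


# Hypothetical settings defined in code (illustration only)
SETTINGS = ["page_size", "row_limit", "query_timeout_ms"]

# Hypothetical documentation text; in a real project this would be
# read from the docs directory that lives alongside the code
DOCS = """
page_size
    Number of rows shown per page.

query_timeout_ms
    Time limit for a query, in milliseconds.
"""

missing = undocumented_names(SETTINGS, DOCS)
print(missing)
```

In a real test suite the final check would be an assertion that `missing` is empty, so a commit that adds a setting without documenting it fails the build.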