GitHub repositories contain tonnes of package.json files. Extracting all the dependencies from them with the help of BigQuery yields a lot of valuable analytical data. Let's explore it.

Data sample

For this analysis, I selected 559 968 GitHub repositories that contain a package.json file. To correlate the data with dates, I took the date of the last commit that modified package.json in each repository. The next step was to extract every dependency from each package.json, which gave me 8.1 million dependency entries (dependencies and devDependencies combined).
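To make the extraction step concrete, here is a minimal Python sketch of how one package.json could be flattened into one row per dependency. It is not the query used for this analysis (that ran in BigQuery); the function name `extract_dependencies`, the row layout, and the sample manifest are all assumptions made for illustration.

```python
import json


def extract_dependencies(package_json_text):
    """Return (name, version_range, kind) rows for one package.json.

    Hypothetical helper: malformed files and non-object sections are
    skipped rather than raising, since real-world manifests are messy.
    """
    try:
        manifest = json.loads(package_json_text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return []
    if not isinstance(manifest, dict):
        return []

    rows = []
    for kind in ("dependencies", "devDependencies"):
        section = manifest.get(kind)
        if not isinstance(section, dict):
            continue  # section missing or not a JSON object
        for name, version_range in section.items():
            rows.append((name, str(version_range), kind))
    return rows


# Made-up sample manifest, for illustration only.
sample = """
{
  "name": "demo",
  "dependencies": {"react": "^16.0.0", "lodash": "^4.17.4"},
  "devDependencies": {"eslint": "^4.0.0"}
}
"""

rows = extract_dependencies(sample)
for name, version_range, kind in rows:
    print(kind, name, version_range)
```

Keeping the section name on each row is what lets the two kinds be counted together or separately later on.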

dependencies or devDependencies?

If you take a look at npm trends you get download data…

