Sergey Abakumoff
2 min read · Aug 12, 2016

The previous story showed how public data and affordable computing power can be used to explore the world of open-source software: the code analyzed the descriptors of npm registry entities to find the most popular ones. This article uses the same methods to infer JavaScript development trends from npm’s public collection of reusable code packages.

The descriptors of npm packages are kept in files called “package.json”. Among other things, each file contains a list of keywords that “help people discover a package as it’s listed in npm search”, for example:

Example of package.json
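As a rough illustration, a descriptor with keywords might look like the sketch below, which also shows how the keywords can be read in JavaScript. The package name, version, description, and keyword values are invented for this example and do not refer to a real package.

```javascript
// Hypothetical package.json content; every value here is made up for illustration.
var packageJsonContent = [
  '{',
  '  "name": "example-string-utils",',
  '  "version": "1.0.0",',
  '  "description": "Illustrative package descriptor",',
  '  "keywords": ["string", "text", "cli"]',
  '}'
].join('\n');

// package.json is plain JSON, so the built-in parser is enough to read it.
var descriptor = JSON.parse(packageJsonContent);

// "keywords" is optional, so fall back to an empty list when it is missing.
var keywords = Array.isArray(descriptor.keywords) ? descriptor.keywords : [];

console.log(keywords.join(', '));
```

The fallback to an empty array matters in practice, because many descriptors omit the keywords field entirely.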

So, let’s find out the trending keywords, shall we? The previous story explained how to select the contents of package.json files from the BigQuery GitHub data; the results were saved to the “githubdataqueries:NpmStat.package_json_content” table.

Before plunging into the query composition and results, let’s look at the amazing UDF feature of Google BigQuery, which allows developers to run custom JavaScript code within a SQL query. A user-defined function accepts a single data row as input and produces zero or more rows as output. Selecting the keywords from the content of a package.json file is a natural fit for JavaScript, because JSON derives from JavaScript’s object syntax and the language parses it natively! Here is the self-explanatory code of the user-defined function that emits the list of a package’s keywords, followed by the SQL query that uses the output of that function to rank the keywords:

Query to Select Top Keywords
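A minimal sketch of such a function follows, assuming BigQuery’s legacy SQL UDF conventions: a function that takes `row` and `emit`, registered with `bigquery.defineFunction`. It is not the original code. The function name `extractKeywords`, the input column `content`, the output column `keyword`, and the choice to lowercase and de-duplicate keywords are all assumptions made for illustration. The `rankKeywords` helper is a hypothetical local stand-in for the ranking that the SQL query performs by grouping and counting.

```javascript
// Sketch of a legacy-SQL-style BigQuery UDF. The names extractKeywords,
// "content" (input column) and "keyword" (output column) are assumptions.
function extractKeywords(row, emit) {
  var descriptor;
  try {
    descriptor = JSON.parse(row.content);
  } catch (e) {
    return; // Malformed package.json: emit zero rows.
  }
  if (!descriptor || !Array.isArray(descriptor.keywords)) {
    return; // No keyword list: emit zero rows.
  }
  var seen = Object.create(null); // No prototype, so "constructor" is a safe key.
  for (var i = 0; i < descriptor.keywords.length; i++) {
    var raw = descriptor.keywords[i];
    if (typeof raw !== 'string') continue;
    // Assumption: normalize case and whitespace so "CLI" and "cli" count together.
    var keyword = raw.trim().toLowerCase();
    if (keyword && !seen[keyword]) {
      seen[keyword] = true;
      emit({ keyword: keyword }); // One output row per distinct keyword.
    }
  }
}

// Registration as it would look inside BigQuery; the guard lets the file run elsewhere.
if (typeof bigquery !== 'undefined') {
  bigquery.defineFunction(
    'extractKeywords',                      // Name used in the SQL query
    ['content'],                            // Input columns
    [{ name: 'keyword', type: 'string' }],  // Output schema
    extractKeywords
  );
}

// Hypothetical local stand-in for the ranking query: count keywords, sort descending.
function rankKeywords(rows) {
  var counts = Object.create(null);
  rows.forEach(function (row) {
    extractKeywords(row, function (out) {
      counts[out.keyword] = (counts[out.keyword] || 0) + 1;
    });
  });
  return Object.keys(counts)
    .map(function (k) { return { keyword: k, count: counts[k] }; })
    .sort(function (a, b) {
      return b.count - a.count || a.keyword.localeCompare(b.keyword);
    });
}

// Usage with three made-up rows; the last one is deliberately not valid JSON.
var sampleRows = [
  { content: '{"keywords": ["cli", "string"]}' },
  { content: '{"keywords": ["CLI", "http"]}' },
  { content: 'not json' }
];
console.log(rankKeywords(sampleRows));
```

Returning early on unparsable input is what makes the “zero or more rows” contract useful here: descriptors that are broken or have no keywords simply drop out of the ranking instead of failing the whole query.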

Can you guess the trending keywords?

Top 100 keywords

Random thoughts on these results:

  1. Keywords related to text processing (“string”, “text” and “parser”) are in the top 10. My guess is that the root cause is that JavaScript’s built-in String object does not expose functionality that developers need on a daily basis. Fortunately, the npm registry has a ton of LeftPad (wink-wink) and other useful string utils!
  2. “cli”, which stands for “command line interface”, is in 2nd place; the closely related “terminal” and “console” are 9th and 11th respectively. A possible explanation is:
  • CLI packages only make sense for Node.js applications
  • the initial Node.js release supported only Linux and Mac OS X, not Windows
  • the CLI is at the heart of Linux and Unix systems

  3. “http” and “browser” are in the top 10. That’s simple: it’s all about the web nowadays!