TL;DR: I built a website where you can explore the Stack Overflow questions referenced in source code on GitHub. Check it out at http://sociting.biz
Motivation
I am a big fan of Google Cloud Platform, and I especially love its data warehouse, BigQuery. In the summer of 2016, GitHub and Google made open-source data available to everyone in BigQuery. Here are the mind-boggling numbers:
This 3TB+ dataset comprises the largest released source of GitHub activity to date. It contains a full snapshot of the content of more than 2.8 million open source GitHub repositories including more than 145…
‘Member circa 2010, when we used a pretty simple approach to implement the login function in our web applications? It was so awesome! HTML form elements gave us everything we needed: attributes to specify the server-side handler, input controls for the username, the password and the “remember me” checkbox, and a submit button. HTML5 allowed us to ensure that the fields were not empty without relying on JavaScript code. The implementation of the server-side handler for form submission depended on the underlying technology, but it was also very simple: if the username or password was incorrect, the browser was instructed to paint the login…
The other day I came across a hilarious tweet
It’s funny indeed, but at the same time it’s a typical showcase of cargo cult programming: the ritual inclusion of code or program structures that serve no real purpose. It might seem to be an edge case, a funky thing you would never meet in the real world, but let’s review a piece of code from a highly rated project (more than 8,000 GitHub stars) called Calypso.
Calypso is the new WordPress.com front-end — a beautiful redesign of the WordPress dashboard using a single-page web application, powered by…
So, you’ve decided to use React in your awesome new project, and now you’ve got a new problem: which state management approach to choose? Is it going to be Redux, Fluxible or Alt? As you work on a project, a ton of third-party tools, libraries and components are right there to help. Public GitHub data lets us look at the summary stats for this flourishing field of modern React-based applications.
The most common approach to building a web app nowadays is bundling a bunch of npm packages (react, redux, react-redux, etc.) to be used in a browser…
The 5th edition of the ECMA-262 standard, aka ES5, introduced some really useful methods of the built-in Array object: forEach, every, some, filter, map, reduce, reduceRight. The only problem is that they may not be present in all implementations of the standard, for example in the MS IE8 implementation, which is based on the 3rd edition of ECMA-262. Fortunately, we can work around this by inserting a polyfill for these methods into our scripts. It’s all well and good, but a polyfill’s code might look a little bit odd; for example, here is the slightly modified MDN implementation of the forEach method:
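The MDN listing is not reproduced in this excerpt. As a rough illustration of what such a polyfill does, here is a minimal sketch written for this summary; it is not the MDN code. The name `forEachFallback` is hypothetical, and the function is installed on `Array.prototype` only when a native `forEach` is missing.

```javascript
// Minimal sketch of an ES5-style forEach fallback (not the MDN code).
// forEachFallback is a hypothetical name chosen for this illustration.
function forEachFallback(callback, thisArg) {
  'use strict'; // keeps `this` as null/undefined instead of the global object

  if (this == null) {
    throw new TypeError('forEachFallback called on null or undefined');
  }
  if (typeof callback !== 'function') {
    throw new TypeError(callback + ' is not a function');
  }

  var O = Object(this);       // works on any array-like object
  var len = O.length >>> 0;   // coerce length to an unsigned 32-bit integer

  for (var k = 0; k < len; k++) {
    if (k in O) {             // skip holes in sparse arrays
      callback.call(thisArg, O[k], k, O);
    }
  }
}

// Install the fallback only where the native method is absent (e.g. IE8).
if (!Array.prototype.forEach) {
  Array.prototype.forEach = forEachFallback;
}
```

The `>>> 0` coercion and the `k in O` check are the parts that tend to look odd at first sight: the first normalizes `length`, the second makes the loop skip missing indices the way the native method does.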
The Internet era changed the way we obtain and process information. Huge volumes of data are free and ubiquitous, and cloud computing has put practically infinite computing power, storage and sophisticated tools at everyone’s disposal on a pay-as-you-go basis. This story explains how to leverage cloud computing power and publicly available data to build a tool for real-time analytics of the presidential debate feedback posted to Twitter. The aim of this article is to show how easy it is to implement pretty interesting, helpful (or malicious!) …
Every enthusiastic JavaScript blogger should write at least one article that compares front-end frameworks, so here is mine! But no worries, this article isn’t the Nth attempt to describe the pros and cons of Angular and React. Rather, it looks at them from an unusual angle by applying text mining methods to commit messages collected from the version control system and remarking on the results. The research is fully reproducible; you can find the data and the Rmd file here. …
My Twitter friend Felipe Hoffa recently posted an excellent story called “400,000 GitHub repositories, 1 billion files, 14 terabytes of code: Spaces or Tabs?”, which reveals the stats behind the never-ending war between two styles of code formatting. I was inspired by this article and got an idea for new research: what’s the situation with swearing in commit messages? Admittedly, the idea is not new at all, and there have been multiple attempts to do the same:
But the recently exposed BigQuery data provides new opportunities to look at it. Also, as I am doing it…
Earlier today I sent the following tweet:
I stumbled upon the “isArray” thing while exploring the public GitHub data available on the Google BigQuery platform. This story presents some interesting new findings I’ve discovered.
The previous stories (1, 2) analyzed the contents of package.json files, the descriptors used for dependency management of Node.js modules. Recently, I noticed some odd numbers in the GitHub data:
Huh? What is going on?
The previous story showcased how public data and affordable computing power can be leveraged to explore the world of open-source software: the code analyzed the descriptors of npm registry entities to find the most popular ones. This article uses the same methods to infer JavaScript development trends from npm’s public collection of packages of reusable code.
The descriptors of npm packages are kept in files called “package.json”. Among other things, they contain a list of keywords that “help people discover a package as it’s listed in npm search”, for example:
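The original example is not included in this excerpt. To make the idea concrete, here is a small sketch of how keywords could be tallied across a set of package.json descriptors. The descriptors below are made-up sample data, and `countKeywords` is a hypothetical helper written for this illustration; it is not the code used in the story.

```javascript
// Made-up package.json contents, used only for illustration.
var sampleDescriptors = [
  '{"name": "pkg-a", "keywords": ["React", "redux"]}',
  '{"name": "pkg-b", "keywords": ["react"]}',
  '{"name": "pkg-c"}',   // no keywords field at all
  'not valid json'       // real-world data contains broken files too
];

// Hypothetical helper: counts how often each keyword occurs.
function countKeywords(jsonTexts) {
  // A prototype-less object, so keywords such as "constructor" are safe keys.
  var counts = Object.create(null);

  jsonTexts.forEach(function (text) {
    var pkg;
    try {
      pkg = JSON.parse(text);
    } catch (e) {
      return; // skip descriptors that are not valid JSON
    }

    var keywords = pkg && Array.isArray(pkg.keywords) ? pkg.keywords : [];
    keywords.forEach(function (keyword) {
      if (typeof keyword !== 'string') { return; }
      var key = keyword.trim().toLowerCase(); // treat "React" and "react" alike
      if (key) {
        counts[key] = (counts[key] || 0) + 1;
      }
    });
  });

  return counts;
}

var keywordCounts = countKeywords(sampleDescriptors);
console.log(keywordCounts.react, keywordCounts.redux);
```

Normalizing case and skipping malformed descriptors are assumptions made here to keep the tally robust; an analysis over the full dataset could just as well run as a query in the data warehouse itself.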

I code therefore I am.