Open Source Words — Part 2

Word cloud by total frequency, generated from READMEs of the Top 2000 repositories on GitHub

There are many words used to describe software: framework, progressive, modular, reusable, extensible, scalable, simple, ad infinitum. Having worked with, and written, a number of web and mobile projects, I was curious: what words make up the collective developer lexicon, and what were they in response to?

To address this, I wrote some Python code to scrape and analyze the READMEs of the Top 2000 highest-starred GitHub repositories. I expected buzzwords to abound, but found that the best READMEs are written with their audience, other developers, in mind. Many of the most frequent words focus on installation, modification, and documentation: how to get up and running quickly and efficiently.


Background

TL;DR

  • Hypothesis: Libraries got big so developers created alternatives
  • READMEs are an important tool in project discoverability
  • Developers want frameworks that are easy to use and remix

jQuery

Perhaps the trends in open source software engineering proceed as a grand Hegelian dialectic, through a back-and-forth of thesis, antithesis, and synthesis. As the frameworks we use became larger and more monolithic, the natural response by some was to simplify and create smaller, more modular frameworks. Take the example of jQuery:

As time progressed, jQuery grew larger and simultaneously less necessary (though thanks to compression technology, the actual bytes sent down the wire grew at a much slower rate). The response?

Some developers believe that jQuery is protecting us from a great demon of browser incompatibility when, in truth, post-IE8, browsers are pretty easy to deal with on their own. – You Might Not Need jQuery

jQuery was created in 2006, and 12 years later it’s still alive! It’s undoubtedly the quintessential JavaScript library, and for its first few years was nothing short of essential. But as ECMAScript standards were adopted and web applications became more complicated, it slowly proved to be a burden.

The main criticisms were that developers only used a small subset of its functions, wasting bytes on unused features; that the compatibility it abstracted was no longer a concern for most projects; and that the tools it provided were orthogonal to the challenges engineers face today. Enter React.

React

As web apps became more complex, a new problem emerged. From The deepest reason why modern JavaScript frameworks exist:

The main problem modern JavaScript frameworks solve is keeping the UI in sync with the state.

More than a decade after jQuery’s launch, we’re in the same place solving a different problem. Whether it’s React’s virtual DOM and diffing algorithm, or Vue.js’ two-way data binding, all frameworks are trying to address this issue of keeping what you see in sync with the data.
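To make "keeping the UI in sync with the state" concrete, here is a minimal sketch of the idea. It is written in Python only to match the analysis code used for this article, and `Store`, `subscribe`, `set_state`, and `render` are hypothetical names, not part of React, Vue.js, or any other framework. It re-renders everything on each change and does no diffing, so it illustrates the problem rather than how any real framework solves it.

```python
class Store:
    """Holds application state and notifies subscribers when it changes."""

    def __init__(self, **initial):
        self._state = dict(initial)
        self._subscribers = []

    def subscribe(self, callback):
        """Register a render function and call it once with the current state."""
        self._subscribers.append(callback)
        callback(self._state)

    def set_state(self, **changes):
        """Update the state, then re-run every subscriber so the view stays in sync."""
        self._state.update(changes)
        for callback in self._subscribers:
            callback(self._state)


# Stand-in for the DOM: each render appends the markup it would have drawn.
rendered = []


def render(state):
    rendered.append(f"<p>Count: {state['count']}</p>")


store = Store(count=0)
store.subscribe(render)     # initial render
store.set_state(count=1)    # state change triggers a re-render
print(rendered[-1])
```

The point is that the application code never touches the output directly: it only changes the state, and the view follows.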

It seems today we’re facing the same challenges. Frameworks solve generic problems, which always entails a tradeoff between functionality and size. The more generic the framework, and the more features it includes, the larger it becomes. Take React for example:

React file size by version

Looks similar to the trajectory of jQuery. The response?

One of React’s biggest selling points was that it was much smaller than Angular/Ember/Backbone etc, but I’m not sure that argument can be leveraged anymore. – coltonv

So here we are: the proverbial debate between size and functionality rages on. Except today, we have another problem: the paradox of choice. Where does one even begin when picking a JavaScript framework?

JavaScript frameworks in 2018

Frank Chimero wrote a fantastic article, Everything Easy Is Hard Again, expressing his frustration over the rise in complexity when building modern websites and applications. His sentiment is shared by many.

simply npm your webpack via grunt with vue babel or bower to react asdfjkl;lkdhgxdlciuhw

That brings us back to words.


Words

I expected to see words like lightweight and progressive among the most frequent, but this is what I found instead:

Top 10 adjectives/adverbs from READMEs, by unique and total frequency
Word cloud by unique frequency, generated from READMEs of the Top 2000 repositories on GitHub
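The two measures in the captions above can be sketched as follows. This is a simplified illustration, not the code used for the analysis. It assumes that "total frequency" means every occurrence of a word across all READMEs, and that "unique frequency" means the number of READMEs in which the word appears at least once. The `tokenize` and `word_frequencies` helpers and the sample READMEs are hypothetical, and the part-of-speech filtering needed to keep only adjectives and adverbs is left out.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a README and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def word_frequencies(readmes):
    """Return (total, unique) counters for a list of README strings.

    total  counts every occurrence of a word.
    unique counts each word at most once per README.
    """
    total = Counter()
    unique = Counter()
    for text in readmes:
        words = tokenize(text)
        total.update(words)
        unique.update(set(words))
    return total, unique


# Made-up sample data standing in for real READMEs.
readmes = [
    "Easy to install. Easy to use.",
    "A simple, easy framework.",
    "Simple and fast.",
]

total, unique = word_frequencies(readmes)
print(total["easy"], unique["easy"])      # 3 occurrences, in 2 READMEs
print(total["simple"], unique["simple"])  # 2 occurrences, in 2 READMEs
```

Counting per README keeps a single project that repeats one word many times from dominating the ranking, which is one reason the two word clouds can differ.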

Like most people, developers are looking for something that is easy, simple, and automatic. Part of the success of any open source project is discoverability. Even the most functional project still needs to solve a problem for its users, and still needs contributors to maintain and improve it. The words extracted from GitHub READMEs align well with advice articles like A Beginners Guide to writing a Kickass README. Any great project needs:

  • Screenshots. Most visitors browse and leave within seconds
  • Demos & examples. Show visitors the real project
  • Documentation. Tell visitors how to easily install and adapt

Words like automatically, available, supported, installed, and easy paint a picture of how the top projects draw in new users.


For more on the code & process, see Open Source Words — Part 1

Code and data are available on GitHub: Tombarr/open-source-words