How to Choose the Best Open Source Module For Your Needs
I really appreciate the current state of open source development. It may not be perfect, but I truly value the free flow of ideas and creativity, and the willingness of so many developers to share what they know and how they solve problems. Ultimately, this is what I think open source should be: solving problems and sharing answers.
However, as with almost everything in life, this coin has another side: Not all answers are equal, and having too many choices can be almost as bad as not having any at all. Once I started writing software that relied on open source modules, I learned this reality pretty quickly. I eventually came up with a set of guidelines to help me make decisions and mitigate some of the inherent risk. Hopefully, you will find these guidelines useful in your searches, too.
These guidelines are not listed in any particular order. I try to consider all of them equally and make my final decision based on an overall score.
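To make the idea of an overall score concrete, here is a minimal sketch of how the guidelines could be tallied with equal weight. The criteria names (author, documentation, code) and the 0 to 5 scale are assumptions chosen for illustration; the ratings themselves are still judgment calls.

```javascript
// Hypothetical helper: averages per-guideline ratings with equal weight.
// The 0-5 scale and the criteria names are illustrative assumptions.
function overallScore(ratings) {
  const values = Object.values(ratings);
  if (values.length === 0) {
    throw new Error("at least one rating is required");
  }
  for (const value of values) {
    if (!Number.isFinite(value) || value < 0 || value > 5) {
      throw new RangeError("ratings must be numbers between 0 and 5");
    }
  }
  // Equal weighting: a plain average, so no single guideline dominates.
  const total = values.reduce((sum, value) => sum + value, 0);
  return total / values.length;
}

// Example: ratings for one candidate module.
const candidate = { author: 4, documentation: 5, code: 3 };
console.log(overallScore(candidate)); // 4
```

Scoring two or three candidate modules this way makes it easier to compare them side by side, although the number is only as good as the ratings behind it.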
Consider the Author(s)
We can’t know all open source authors personally, but there are some indicators that should help us:
- Are they well known within their community?
- How many modules have they published?
- How many stars/forks/contributors/followers do their GitHub repos have?
- What is their overall level of contribution activity?
- Do they respond well to issues and PRs?
- Are the modules updated in step with the language or engine they are based on?
The idea here is to formulate some sense of their overall commitment and interest in open source development, as well as their relationship with other community members, and their awareness of how the ecosystem is evolving around them.
Does this mean that we should only be interested in utilizing code written by the top 1%? Not necessarily. Being popular is not synonymous with being best. However, my experience has shown that reliability and consistency are very important concerns when using pretty much anything developed by someone else.
Consider the Documentation
Documentation is key. Good documentation will significantly sway my opinion about an open source module:
- Does the module have a README?
- Does the documentation include code samples of real-world usage, and do the samples actually work?
- Is there a CHANGELOG? Or are changes adequately described in release notes?
- Can I understand the documentation well enough to make the best use of the module? (Admittedly, this may only become apparent after you start using the module. C’est la vie.)
- If there is a lot of information to convey, is the documentation broken out into separate files or pages? Are those sources understandable?
Some developers may argue that code should be self-documenting or that unit tests should be enough to show how code works. While I definitely agree with this viewpoint in theory, in practice I don’t want to spend hours or even days reading code just to figure out how it works. Please provide us with a good reference document and make sure it’s kept up to date. This is part of our job as professional software developers, and all of us should be setting a good example for one another.
Consider the Code
I don’t necessarily take the time to read every line of code in a module that I am considering using in my project. But I do check for the basics:
- Does the module follow a coding standard? A good way to check is to see whether the module includes a lint command, or whether the README indicates that a standard is being followed.
I do have my own preference for coding in JS, but I’m not super concerned about which standard is used, as long as a standard is used. This is important because there’s a good chance that I will need to consult a module’s source code at some point in time, and I do not want to deal with code that is poorly formatted.
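For an npm package, a lint command usually shows up as an entry in the scripts section of package.json. The sketch below assumes that the module is an npm package and that its lint scripts are named "lint" or start with "lint:". That naming is a common convention, not a rule, so an empty result only means it is worth looking more closely. The function name and the sample package are hypothetical.

```javascript
// Hypothetical helper: `pkg` is the already-parsed contents of a package.json.
// Returns the names of scripts that look like lint commands.
function findLintScripts(pkg) {
  // Tolerate a missing or null "scripts" section.
  const scripts = (pkg && typeof pkg.scripts === "object" && pkg.scripts) || {};
  return Object.keys(scripts).filter(
    (name) => name === "lint" || name.startsWith("lint:")
  );
}

// Example: a made-up package.json, shown here as an object literal.
const examplePkg = {
  name: "some-module",
  version: "1.2.0",
  scripts: { test: "mocha", lint: "eslint ." },
};
console.log(findLintScripts(examplePkg)); // [ 'lint' ]
```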
- Does the module include automated tests? Are those tests automatically run by a CI system, and does the README include a badge indicating the current build/test state?
Cloud-based continuous integration services are so ubiquitous these days that I don’t think there is a reasonable excuse not to use them, even for the smallest of projects.
(There is a side note that I think is important to mention here. Just because a badge indicates that the latest build failed, it doesn’t mean the build actually failed. Sometimes web caching gets in the way of indicating the true status, and so I always click on these badges to check for myself. I’ve also found that the build matrix may include environments or versions that I’m not concerned about, and their failure may or may not be an issue.)
- Does the module use Semantic Versioning, and is the current release at 1.0.0 or higher?
SemVer isn’t always the easiest concept to apply correctly, but I always award points to those who are trying. I also tend to avoid modules that are stuck in “not ready for prime time” mode (meaning, they proudly indicate they are major version 0 and were initially released months or years ago), even if they are at least somewhat popular and fill an important need. To me, this is a big red flag. Remember what I said above about reliability and consistency? I’ve come to the conclusion that using SemVer is a sign of respect and consideration for others. We should stand by our work, and if we’re ready to release something “into the wild,” so to speak, then we should at least provide and communicate some guarantee about stability.
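The "1.0.0 or higher" check is easy to express in code. The following is a simplified sketch, not a full SemVer parser: it reads the MAJOR.MINOR.PATCH core, notes any pre-release tag, and ignores build metadata. Accepting a leading "v" (as often seen in Git tags) and treating pre-releases as not yet stable are my own assumptions, and the function names are hypothetical.

```javascript
// Simplified parser for version strings such as "1.4.2" or "2.0.0-beta.1".
// Returns null when the string does not look like MAJOR.MINOR.PATCH.
function parseVersion(version) {
  const pattern =
    /^v?(\d+)\.(\d+)\.(\d+)(?:-([0-9A-Za-z.-]+))?(?:\+[0-9A-Za-z.-]+)?$/;
  const match = pattern.exec(version);
  if (!match) {
    return null;
  }
  return {
    major: Number(match[1]),
    minor: Number(match[2]),
    patch: Number(match[3]),
    prerelease: match[4] || null,
  };
}

// True when the release is at major version 1 or higher and is not a pre-release.
function isStableRelease(version) {
  const parsed = parseVersion(version);
  return parsed !== null && parsed.major >= 1 && parsed.prerelease === null;
}

console.log(isStableRelease("1.0.0")); // true
console.log(isStableRelease("0.9.3")); // false
console.log(isStableRelease("2.0.0-beta.1")); // false
```

A check like this only looks at the version number. Whether a module that has sat at major version 0 for years is a red flag still depends on its release history.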
- Does the API generally conform to a style or pattern (functional vs. imperative, fluent, declarative, etc.) that I’m comfortable programming against?
API design is a little tricky to get right, and some APIs definitely make more sense than others. One of the good things about having multiple choices between different modules is that we can often pick one that matches our preferences.
The Bottom Line
I like the idea that some of this work can be automated, but at the end of the day, I’m the one who’s ultimately responsible for making the decision. So I want to be involved in the process. I also don’t like the idea of relying solely on “curated” or “best of” lists, because I’ve often found they either aren’t up to date or aren’t original.
Two final thoughts:
- Yes, this process is a little time-consuming, but I don’t think there’s any way around that. From my perspective, it’s worth spending a few hours (or even days) doing some detective work if the lifespan of my application that depends on these modules is measured in months or years.
- This process does not provide an absolute guarantee of success. There is still some risk involved, but there are additional defensive measures that can be taken, which I will discuss in a future article.