Best Practices While Creating Reusable Software Packages

Photo by Claudio Schwarz | @purzlbaum on Unsplash

One of the best ways to share code in Python is by creating packages (or libraries/libs). A package is nothing but a set of structured software components (functions, classes, modules). Two characteristics are desirable in a package: solving a specific problem and being reusable. Once you have new features that could be useful to several software projects, these two fundamentals will help you choose among three paths:

  1. Should I create a package to solve this problem?
  2. Should I add new features to an existing package?
  3. Should I just duplicate the needed code?

When to create a package?

When certain functionality *deserves* a package (and not just another class), it is usually related to a non-functional problem, that is, one that is not tied to the business domain of a single project. Good examples of packaged features are:

  1. Access to APIs
  2. Data manipulation
  3. Application of computational methods
  4. Extension of the language with features it does not have (there is probably already a public lib for this)

It means that being a reusable solution is not, by itself, reason enough to create a new package. External components come with a lot of extra heavy lifting: testing, maintenance, documentation, and slower builds. Libs that solve a specific problem and address multiple use cases are more widely adopted. Features related to a specific domain usually end up coupled to the project, and even if they are refactored into a lib they will not be used by others. This can lead to a large decoupling effort with no added value in reuse.

Towards a stable API

After you are sure there’s a need for a new package, some very important architectural decisions must be made. Libraries must offer a stable, versioned API, so that outside users can rely on features behaving in a well-defined way. The hardest part is to create an interface that will remain stable while new features are added, achieving so-called backward compatibility.
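As a minimal sketch of what backward compatibility can look like in Python, the snippet below adds a new option with a default value and keeps an old function name alive as a deprecated wrapper. The names `fetch_report`, `get_report`, and `retries` are hypothetical and only serve the illustration; they do not come from any real library.

```python
import warnings


def fetch_report(account_id: str, *, timeout: float = 10.0, retries: int = 0) -> dict:
    """Hypothetical public function of a package.

    `retries` stands for an option added in a later release. Because it is
    keyword-only and has a default, calls written before it existed still work.
    """
    return {"account_id": account_id, "timeout": timeout, "retries": retries}


def get_report(account_id: str, timeout: float = 10.0) -> dict:
    """Hypothetical old name, kept as a thin wrapper so existing callers do not break."""
    warnings.warn(
        "get_report() is deprecated; use fetch_report() instead",
        DeprecationWarning,
        stacklevel=2,  # point the warning at the caller, not at this wrapper
    )
    return fetch_report(account_id, timeout=timeout)
```

Old code that calls `get_report("acc-1")` keeps working and only sees a `DeprecationWarning`, which gives users time to migrate before the old name is removed in a later major release.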

You will probably adhere to semantic versioning as a direct way of telling your users what kind of changes to expect in each release. That’s great, but as soon as many projects rely upon your contract, something odd will happen:

Workflow by xkcd

Known as Hyrum’s Law, this effect means that people will depend on every single observable behavior of your package. No matter whether it was implemented on purpose or not, you will never be able to fully isolate what should be used as a standard feature and what is a side effect. What you can do is mitigate this effect with some of the best practices shown here.
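To make the effect concrete, here is a small hypothetical example. Both functions satisfy the same documented contract ("return the distinct tags"), but the first one happens to return them sorted. A caller that relies on that undocumented order breaks when the implementation changes, even though the contract did not. The function names are invented for this sketch.

```python
def unique_tags(tags: list[str]) -> list[str]:
    """Return the distinct tags. Nothing is promised about their order."""
    # This implementation happens to sort, so the order becomes observable behavior.
    return sorted(set(tags))


def unique_tags_v2(tags: list[str]) -> list[str]:
    """Same documented contract, but keeps first-seen order instead of sorting."""
    return list(dict.fromkeys(tags))


tags = ["beta", "alpha", "beta"]

# A caller that silently depends on the undocumented ordering:
first_v1 = unique_tags(tags)[0]     # "alpha"
first_v2 = unique_tags_v2(tags)[0]  # "beta"
```

From the author's point of view nothing in the contract changed between the two versions; from the caller's point of view the release broke their code.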

Best practices on software packaging

  • Parameterizable Configuration. Imagine that each project uses a different form of configuration in each environment: Vault, Docker args, environment variables, credentials on Jenkins, secrets on GitHub, Kubernetes Secrets, etc. This means that any configuration the lib needs (for example, AWS credentials or bank access parameters) has to be flexible. Keep in mind that the package may one day be made available to the general public, so the package and its configuration must be completely isolated from the current domain.
  • Have a good README. Include a basic project description, step-by-step installation instructions, and basic usage examples. Additionally, write sections to help anyone who wants to contribute: how to set up the development environment and how to open PRs.
  • Document the public API. Ideally, the entire public API (public methods) should be documented. This includes a description of the parameters, features, and specific usage examples. It is a great place to describe special (or counterintuitive) behaviors.
  • Consider different use cases. In the original case (which led to the package being written), it may be that event X always occurs before Y. This may not be true in all cases. So, keep the methods decoupled, following the Single Responsibility Principle. Even if it means writing a few more lines to use the package, this allows it to serve an even larger audience. As a mental exercise, imagine how outsiders will use certain workflows or features.
  • Write tests. Although testing is good practice for any project, tests in a package increase the stability of its interface and the confidence of future users. More projects can depend more deeply on a well-tested package, with less chance of bugs or contract breaches.
  • [Python] Use Type Hints. Python’s optional typing system acts as living documentation of component properties, parameters, and return values. This means that IDEs and static code checkers can point out bugs and code smells hidden in the flexible nature of the language. Especially in packaged features, where the interface is more important than the format, typing makes it easier to use and adopt the standards provided to the user. This tip also applies to other languages (with strict or loose type systems), where those indications should be explicit and complete, avoiding generic types or useless limitations.
  • Define Code Standards. Code standards (nomenclature, formatting, linting) facilitate development and code review, as they remove the cognitive burden of analyzing details that are trivial for a computer to check and can be automated in CI. This lowers the entry barrier for new contributions, which is one of the main objectives when creating a package. You can always use a well-known standard for the language you are working with.
  • Worry about security. One of the reasons for using a package is not having to worry about the implementation details. This means that security problems must be addressed by the library’s authors, since the more the package is used, the greater the impact of possible vulnerabilities.
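Several of the practices above (parameterizable configuration, a documented public API, and type hints) can be sketched together in a few lines. The example below is only an illustration under assumed names: `ClientConfig`, `Client`, the `MYLIB_*` variables, and the example URL are all hypothetical. The point is that the library receives its configuration as a plain object and does not care whether the values came from Vault, Kubernetes, or environment variables.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Hypothetical defaults, used only for this illustration.
DEFAULT_BASE_URL = "https://api.example.com"
DEFAULT_TIMEOUT_SECONDS = 10.0


@dataclass(frozen=True)
class ClientConfig:
    """Settings the client needs, independent of where they are stored."""

    api_key: str
    base_url: str = DEFAULT_BASE_URL
    timeout_seconds: float = DEFAULT_TIMEOUT_SECONDS

    @classmethod
    def from_env(cls, env: Optional[Mapping[str, str]] = None) -> "ClientConfig":
        """Optional convenience: build the config from environment-style variables.

        Raises:
            KeyError: if MYLIB_API_KEY is not set.
        """
        env = os.environ if env is None else env
        return cls(
            api_key=env["MYLIB_API_KEY"],
            base_url=env.get("MYLIB_BASE_URL", DEFAULT_BASE_URL),
            timeout_seconds=float(env.get("MYLIB_TIMEOUT", DEFAULT_TIMEOUT_SECONDS)),
        )


class Client:
    """Hypothetical entry point of the package."""

    def __init__(self, config: ClientConfig) -> None:
        # The client only sees a ClientConfig; it never reads secrets itself.
        self._config = config

    def build_url(self, path: str) -> str:
        """Join the configured base URL and a resource path with a single slash."""
        return f"{self._config.base_url.rstrip('/')}/{path.lstrip('/')}"
```

A project that stores secrets in Vault can construct `ClientConfig(api_key=...)` directly, while another can call `ClientConfig.from_env()`. Either way, the type hints and docstrings tell users and their tools what the package expects.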

Conclusion

Sharing code through packages will always be a complex task, and all we can do is expose a stable, well-documented, and useful interface so that users can learn how to use the provided features effectively. Never believe versioning will save you. Instead, invest in putting fundamental concepts of Software Engineering to work for you: cohesion, decoupling, the Single Responsibility Principle, and many others. Useful, well-tested, and documented packages are certainly more likely to add value to complex systems.
