Package Overload: How to Optimise Third-Party Packages for a Leaner, Meaner Project

Tom Noble
Deloitte UK Engineering Blog
8 min read · Mar 4, 2024

As an engineer, like you, I depend on third-party packages to speed up feature delivery. Packages allow our teams to deliver awesome features and reduce the need to reinvent the wheel.

Packages: a way of organising and encapsulating code and related resources in a modular unit.

At the start of our engineering journey, we develop a fondness for package ecosystems. Package managers such as npm, NuGet and Maven (to name but a few) are commonplace in our developer toolset. The inclusion of a package forms a dependency, and those packages may also reference other packages, deepening the dependency chain. As we progress through our careers, we encounter some of the pains of package proliferation.

In this article, I share some of my top tips and considerations to help engineers make informed decisions about the best use of packages.

About me

I started my career in software engineering in 2007. I’ve had the privilege of working across different sectors and many projects. I currently lead an incredible cross-functional team, Deloitte TechWorks with Cloud and Engineering.

Dependencies are one of the fundamental building blocks of software development. From the early days, developers would write modules and reference those within code. Around the 2000s, developers became more protective and started committing binaries (compiled executable files) to source control alongside code. During this time, the centralised version control system Subversion was widely used; the first release of Git was in 2005 and Microsoft’s Team Foundation Server (TFS) in 2006. The practice of version controlling dependencies was wasteful: it meant cloning all versions of the dependencies in a “clone all or nothing” strategy, leading to inordinate clone times.

From this point we matured our approach to dependency management. We began investing in ecosystems and package managers to handle the download, installation, and update of dependencies, enabling developers to leverage tried and tested capabilities with minimum effort. Regrettably, despite significant progress, some of the core considerations that were present in my early career remain today.

Recent advancements have enabled us, as engineers, to better understand the breadth and depth of our dependencies. GitHub’s Dependency Graph, for example, allows for the visualisation of dependencies across a wide range of popular package ecosystems (GitHub Docs, 2024). Repository contributors have the ability to improve the accuracy of the dependency graph through the maintenance of lock files and/or the Dependency submissions API (in Beta at the time of writing).

Improving Interoperability

In the simplest sense, inclusion of a dependency should have limited impact on the wider system. Yet, like end products, dependencies themselves are seldom free from dependencies. The web of dependencies and versions often leads to interoperability challenges.

Interoperability: the degree to which two products, programs, etc. can be used together, or the quality of being able to be used together.

Cambridge Dictionary, 2023

Each package manager operates in a different way. Many have a shared version model where only one version of a dependency (often the latest) may exist. Conversely, node package manager (npm) allows many versions through unique dependency trees. The most basic of interoperability challenges, conflicting versions, is avoided through dependency isolation and versioning. This is not a silver bullet; cross-package communication remains affected if different versions exist.
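To make the shared version model concrete, here is a minimal sketch of how a conflict might be spotted. The package names and versions are hypothetical, and real package managers resolve version ranges rather than the exact versions assumed here.

```python
from collections import defaultdict

# Hypothetical requirements: each package lists the exact version it needs
# of each of its own dependencies.
REQUIREMENTS = {
    "app": {"http-client": "2.1.0", "logger": "1.4.0"},
    "http-client": {"logger": "2.0.0"},
    "logger": {},
}


def find_conflicts(requirements):
    """Return dependencies that are requested at more than one version."""
    requested = defaultdict(set)
    for deps in requirements.values():
        for name, version in deps.items():
            requested[name].add(version)
    return {
        name: sorted(versions)
        for name, versions in requested.items()
        if len(versions) > 1
    }


print(find_conflicts(REQUIREMENTS))  # {'logger': ['1.4.0', '2.0.0']}
```

In a shared version model only one of the two logger versions could be installed, so one consumer would run against a version it was not written for. An isolated dependency tree would install both, at the cost of the cross-package communication issue noted above.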

As an example, it is not uncommon for a developer to follow documentation, tutorials, or courses, only to reach the pivotal moment and encounter compatibility issues related to libraries and dependencies. Frustrating and fruitless attempts at multiple fixes found on community sites can leave individuals bewildered; this kind of interoperability troubleshooting can take up a significant amount of a developer’s time.

Diving a little deeper into versioning, an important part of a developer’s learning is the understanding of versioning systems. A common versioning system is Semantic Versioning, where a package (module of code) adopts an incrementing numerical nomenclature. The structure of the version number is MAJOR.MINOR.PATCH; consumers of dependencies using such a format can infer implications based on the increment.

· MAJOR = Breaking change

· MINOR = New functionality, should be backward compatible

· PATCH = Bugfixes, usually low risk upgrades

A developer, subject to the constraints of their package manager, can choose to accept package upgrades based on certain criteria. For example, a dependency on an npm package (declared in a package.json file) could specify the version range 1.0.x, meaning the latest PATCH release will be consumed automatically on the next npm update. Taking this one step further, developers can employ numerous other version management techniques, from immutable versions to package lock files.

Note: A package lock file describes the packages, and the associated versions, referenced by a solution.
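The 1.0.x range described above can be sketched as follows. This is an illustration only, not npm's resolver: the function names and the list of published versions are assumptions, and pre-release tags and other range syntaxes are ignored.

```python
def parse(version):
    """Split 'MAJOR.MINOR.PATCH' into a tuple of integers."""
    major, minor, patch = version.split(".")
    return int(major), int(minor), int(patch)


def matches(version, pattern):
    """True if a version satisfies an x-range pattern such as '1.0.x'."""
    return all(
        want == "x" or int(want) == have
        for want, have in zip(pattern.split("."), parse(version))
    )


def resolve(pattern, available):
    """Pick the highest available version that satisfies the pattern."""
    candidates = [v for v in available if matches(v, pattern)]
    return max(candidates, key=parse) if candidates else None


# Hypothetical list of published versions for one package.
published = ["1.0.0", "1.0.9", "1.0.10", "1.1.0", "2.0.0"]

print(resolve("1.0.x", published))  # 1.0.10
print(resolve("1.x.x", published))  # 1.1.0
```

Versions are compared as integer tuples, not strings, so 1.0.10 correctly ranks above 1.0.9. A lock file would record the resolved version (here 1.0.10) so that every install uses the same one until the lock is deliberately updated.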

Moving beyond version conflicts, some packages make assumptions on approach (i.e., are opinionated). This is often present in design choices or implementation style. Whilst opinionated software is not necessarily bad, it can impact interoperability. A developer does not have to stray far before finding a framework that assumes a particular design or approach, later requiring thought and consideration on interoperability with the wider system. Without straying into the debate of front-end frameworks, a typical example would be Angular, due to its tight coupling to TypeScript and predefined application structures.

Desktop/mobile applications are prone to interoperability issues. Within a single device there are many components in play: the operating system, software development kit (SDK), applications and the hardware itself. Over time, each component becomes more feature rich, as do the applications built upon them. This leads to greater issues when developing features whilst trying to maintain support for a wide user base. When including packages, bear in mind that their developers may have assumed certain criteria that differ from the intent of your application, leading to challenges with support and feature availability.

Development methodologies and tooling have been forced to adapt to account for interoperability challenges. One example is the rise of virtual environments (e.g. pipenv), which allow developers to bundle runtimes and packages into a contained environment to ease compatibility and development. This leads on to the bundling of applications through containerisation, the practice of packaging an entire application into a single deployable container (image).

Key takeaways

  • Leverage semantic versioning to maintain interoperability.
  • Embrace virtual environments/containerisation.
  • Select packages based on solution requirements.

Enhancing Security

The breadth of security considerations is profound, not least in the domain of third-party dependency risk management. As we include packages and their dependencies in our solutions, we increase our risk because we inherit the risks of each dependency.

Example:

Application — depends on → Package A — depends on → Package B

Package A and B have no known vulnerabilities and are well-known open-source components. Package B is known as a transitive (indirect) dependency.

Since publishing, Package B has had a vulnerability disclosed in the CVE database. Without proper monitoring, reporting and remediation, our solution is vulnerable. The likelihood of this is high, one recent example being the Log4j vulnerability.
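The example chain above can be checked with a short sketch that walks the dependency graph and reports every path leading to a flagged package. The graph and the set of vulnerable packages are hypothetical stand-ins; in practice these would come from a lock file and an advisory source such as the CVE database.

```python
# Hypothetical dependency graph matching the example above.
DEPENDENCIES = {
    "application": ["package-a"],
    "package-a": ["package-b"],
    "package-b": [],
}

# Hypothetical advisory data: packages with a disclosed vulnerability.
KNOWN_VULNERABLE = {"package-b"}


def vulnerable_paths(root, dependencies, vulnerable):
    """Return every dependency chain from root that reaches a vulnerable package."""
    paths = []
    stack = [[root]]
    while stack:
        path = stack.pop()
        current = path[-1]
        if current in vulnerable:
            paths.append(path)
        for child in dependencies.get(current, []):
            if child not in path:  # guard against circular dependencies
                stack.append(path + [child])
    return paths


for path in vulnerable_paths("application", DEPENDENCIES, KNOWN_VULNERABLE):
    print(" -> ".join(path))  # application -> package-a -> package-b
```

Reporting the full path, rather than only the package name, shows which direct dependency would need upgrading or replacing to remove the transitive risk.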

Log4j is used worldwide across software applications and online services, and the vulnerability requires very little expertise to exploit. This makes Log4shell potentially the most severe computer vulnerability in years.

NCSC, 2021

The management of packages, and their sources, requires ongoing monitoring and reporting. Certifications from industry bodies such as ISC2 exist on this very topic. A healthy mix of CI/CD tooling and Cyber Security professionals supports this process. Don’t fall foul of A06:2021 from the OWASP Top 10, “Vulnerable and Outdated Components”.

DO YOU HAVE A PROCESS FOR CONTINUALLY MONITORING YOUR SOFTWARE SUPPLY CHAIN?

An additional concern is whether the sovereignty and licensing of packages influence the rights related to your application. Packages are bound by licensing terms, defining requirements against the product. The Open Source Initiative has a list of common license types. Additionally, services exist to explain the implications of each license type, e.g., https://choosealicense.com/licenses/. Development teams should maintain a permissible list of license types alongside supply chain details.
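A permissible list of license types could be enforced with a check along these lines. This is a minimal sketch: the allowlist is an example policy only, the package names are hypothetical, and the license identifiers follow SPDX naming.

```python
# Example policy only; each team should agree its own list.
ALLOWED_LICENSES = {"MIT", "Apache-2.0", "BSD-3-Clause"}

# Hypothetical declared licenses, e.g. gathered from package metadata.
DECLARED = {
    "package-a": "MIT",
    "package-b": "GPL-3.0-only",
    "package-c": None,  # no license declared
}


def review_licenses(declared, allowed):
    """Return packages whose license is missing or not on the allowlist."""
    return {
        name: license_id or "UNKNOWN"
        for name, license_id in declared.items()
        if license_id not in allowed
    }


print(review_licenses(DECLARED, ALLOWED_LICENSES))
# {'package-b': 'GPL-3.0-only', 'package-c': 'UNKNOWN'}
```

A missing license is flagged for review rather than silently accepted, as the absence of licensing terms does not grant permission to use the package.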

Key takeaways

  • Embed tools into CI/CD pipelines to identify security risks early.
  • Maintain a list of dependencies; understand their purpose and document the origin.
  • Regularly monitor sources such as the CVE Database to understand risk.

Solution integrity

Solution integrity is a broad term, encompassing aspects of the former two topics. If we define integrity as the condition of being unified or sound in construction, we can derive the following attributes, where a solution:

  • Is easy to understand.
  • Is of logical construction.
  • Is testable and tested.
  • Meets specifications; and
  • Is free from defects and errors.

When a development team has full control over the source code in their solution, these attributes can be monitored and addressed easily. A mixture of coding standards, good practices, testing approaches and static analysers allows the team to build confidence in integrity. In contrast, the inclusion of third-party packages leads to challenges in maintaining integrity.

Expanding on the attributes of solution integrity, understanding each of the following areas will aid a developer in navigating the associated challenges:

Easy to understand.

When including a package, the ease with which developers can understand the source code decreases. Developers remain responsible for ensuring the code they control is well architected, developed and documented.

Is of logical construction.

The design (and thus architecture) of the solution must maintain separation from the design of the third-party components. Constraining or influencing solution architecture based on a dependency reduces maintainability and further embeds the dependency.

Both testable and tested.

A solution’s testability is influenced by the architecture of its dependencies. The first goal is to set clear boundaries. Then, define metrics and reporting standards. Apply caution around often misunderstood metrics such as code coverage. Finally, ensure test approaches cover functionality, rather than implementation detail.
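One way to set such a boundary is to place the dependency behind a small interface owned by the solution, then test behaviour through that interface. The sketch below assumes a hypothetical currency conversion package; all names are illustrative.

```python
from typing import Protocol


class CurrencyConverter(Protocol):
    """Boundary owned by the solution; a third-party client sits behind it."""

    def convert(self, amount: float, source: str, target: str) -> float: ...


class FixedRateConverter:
    """Test double standing in for an adapter around a third-party package."""

    def __init__(self, rates):
        self._rates = rates

    def convert(self, amount, source, target):
        return amount * self._rates[(source, target)]


def invoice_total(line_items, converter: CurrencyConverter, target: str) -> float:
    """Sum (amount, currency) line items in the target currency."""
    total = 0.0
    for amount, currency in line_items:
        if currency == target:
            total += amount
        else:
            total += converter.convert(amount, currency, target)
    return round(total, 2)


def test_invoice_total_converts_foreign_currencies():
    converter = FixedRateConverter({("USD", "GBP"): 0.5})
    items = [(10.0, "GBP"), (20.0, "USD")]
    assert invoice_total(items, converter, "GBP") == 20.0


test_invoice_total_converts_foreign_currencies()
```

The test asserts on the outcome (the invoice total), not on how the converter was called, so the underlying package can be upgraded or swapped without rewriting the test.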

Meets specifications.

Solution specifications should be agnostic of the functionality provided by its dependencies. That is to say, the requirements should not be constrained by the features of dependencies. Consider the long-term support/maintainability of dependencies; for example, check the originating code repository for commit history, active pull requests and maintainers. Solution specifications are adaptable, yet a developer’s influence over a third-party dependency is often limited.

Free from defects or errors.

With parallels to being testable and tested, dependencies can introduce unexpected behaviours into a system. Suitable test coverage and approaches, for example automated regression tests over critical functionality, should be included to minimise risk. Specifying versions can also help reduce the risk of regression, although it increases the cost of maintenance as upgrades become a more manual process. Choose dependencies based on the maturity of their solution, source, and credibility. My colleague, John Gimber, wrote a great article on catching errors early: Failing Fast, Succeed Faster. Off-the-shelf tooling, such as GitHub’s Dependabot, supports teams in failing fast and maintaining the integrity of their solution.

Key takeaways

  • Avoid over-engineering; select the right tool (package) for the job.
  • A robust test framework increases confidence in your solution, including its integrity.

Conclusion

We’ve explored some of the considerations I employ when including packages in my solutions. The challenge is to assess whether the benefits of using packages outweigh the risks/challenges. As engineers, our job is to make informed decisions around package inclusion, being mindful of interoperability, security, and solution integrity. Regular monitoring and maintenance are essential in delivering quality, secure solutions.

How do you manage packages and third-party dependencies within your ecosystem? Do you have a standard process? Share your thoughts with me on LinkedIn.

References:

Cambridge Dictionary (2023). interoperability. [online] Available at: https://dictionary.cambridge.org/dictionary/english/interoperability. (Accessed: 8 November 2023)

GitHub Docs (2024). About the Dependency Graph. [online] Available at: https://docs.github.com/en/code-security/supply-chain-security/understanding-your-software-supply-chain/about-the-dependency-graph. (Accessed: 16 January 2024)

NCSC (2021). Log4j vulnerability: what everyone needs to know. [online] Available at: https://www.ncsc.gov.uk/information/log4j-vulnerability-what-everyone-needs-to-know. (Accessed: 8 November 2023)

Disclaimer

Note: This article speaks only to my personal views/experiences, is not published on behalf of Deloitte LLP and associated firms, and does not constitute professional or legal advice.

All product names, logos, and brands are property of their respective owners. All company, product and service names used in this website are for identification purposes only. Use of these names, logos, and brands does not imply endorsement.
