Two years of monorepo hands-on experience

Pavlik Kiselev
Published in ING Blog
Feb 10, 2022 · 9 min read

An analysis of the well-known pros and cons of the monorepo approach after two years of using it. Tears or happiness?

The target audience of this article is developers who are either planning to adopt a monorepo or have only recently started using one.

What is monorepo?

The idea of monorepo is not new. Instead of having one project in one Git repository (monolith) or multiple packages in multiple Git repositories (a.k.a. microservices or multi-repo), let’s put all our packages in one Git repo.
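In practice, a small front-end monorepo can look like the sketch below (the package names are made up, purely for illustration):

```
my-team-monorepo/
├── lerna.json          # monorepo configuration (more on Lerna below)
├── package.json        # root package.json with shared tooling and scripts
└── packages/
    ├── button/         # every folder under packages/ is a regular npm package
    ├── date-picker/
    └── form-utils/
```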

More about it can be found on Wiki (of course) or in some interesting articles (one on TopTal for front-end projects and another on Medium asking you not to do it).

Our situation

We don’t have a single repository for the entire organization. It’s not even a monorepo for all front-end code. It’s a monorepo for our team of fewer than 10 people. This is quite a significant difference, but more about that later.

We started two years ago with 5 packages. Now we have more than 30. We survived a few migrations, like GitLab CI to Azure DevOps and Storybook 5 with webpack to Storybook 6 with Rollup.js.

I will compare the advantages and disadvantages from the public sources mentioned above (the Wiki page and the article asking you not to do it) with my own experience and add the lessons I learned along the way.

Disadvantages from public sources above

Here are the points mentioned in the articles above:

  1. Loss of version information (from Wiki): one version across all the packages. Meaning, no matter what’s changed and where, all the packages get a new version.
  2. Lack of per-project access control (from Wiki): access control is set per repository. If all the projects are in one repository, developers have access either to all of them or to none.
  3. More storage is needed by default (from Wiki): you need to download the code of all the packages, even if you don’t work with it.
  4. Tight coupling and OSS (from “Monorepos: Please don’t!” article): because everything is in one repository, the code cannot be extracted to GitHub, for example.
  5. Scalability challenges (from Wiki) / VCS scalability (from “Monorepos: Please don’t!” article): Git struggles to keep up with gigabytes (sometimes terabytes) of code.

Now I will go over them one by one and compare them with my own experience.

Loss of version information

Up to you. Really. We use Lerna to manage our monorepo. Lerna has a flexible configuration, and one of the options it gives is not to lock the versions together. Then versioning happens independently for each package, and semantic versioning on a per-project basis is not lost.
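For reference, this is roughly what that option looks like in lerna.json; a minimal sketch, assuming the packages live under a packages/ folder:

```json
{
  "packages": ["packages/*"],
  "version": "independent"
}
```

With "version": "independent", lerna version bumps only the packages that actually changed, each according to its own commit history.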

Lack of per-project access control

Very true. I have never seen a system like GitHub, GitLab, Azure DevOps, or any other where access control can go deeper than per repository. However, since each team member must have access to all projects, this is not a problem for us.

More storage is needed by default

Debatable. I’m a front-end developer, and the biggest space usage comes not from the project itself but from node_modules, especially taking into account that a big chunk of those packages is development-only. Libraries like Karma and webpack can each be bigger than your entire codebase.

And now comes another significant benefit of Lerna. It can hoist common devDependencies (actually, production dependencies too) of all packages, which remarkably reduces the space. For example, we have more than 30 packages in our monorepo, and each package has, on average, 200MB of node_modules. Thus, the savings for us are around 6GB, which is enormous in my opinion.
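Enabling the hoisting is a small configuration change. A minimal sketch of what it can look like in lerna.json (the same can be achieved with lerna bootstrap --hoist on the command line):

```json
{
  "command": {
    "bootstrap": {
      "hoist": true
    }
  }
}
```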

The overall situation is changing with pnpm and other package managers that link dependencies instead of copying them the way Lerna does.

Tight coupling and OSS

Very questionable. It’s not a problem to extract a package from the monorepo and put it on GitHub.

Bonus: Short-term thinking (part of “Monorepos: Please don’t!”)

Quite the opposite. During the last two years, our own approaches have changed. We found new tools and methods to build the packages more enjoyably and reliably. Of course, this caused changes to the existing packages. We constantly spot new similarities that can be abstracted.

I guess the point about less “long-term” thinking is valid from this perspective. In theory, a lot of the abstractions we found could have been thought of at the very beginning.

However, in my opinion, there is no more “long-term” thinking than preparing for the changes the future will bring. It comes along with the “agile” mindset: no plan is perfect. Being able to introduce significant changes cheaply and quickly with a monorepo is more important than trying to come up with the ideal design.

Scalability challenges

Not applicable. This comes again from the fact that our monorepo is for a single team. Therefore, committing, building, and publishing the packages all happen in a reasonable time. Besides this, many CI systems provide the possibility to set up multiple pipelines for one Git repo that run depending on which folders changed.
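For example, in Azure DevOps a per-package pipeline can be triggered only when that package’s folder changes; a sketch with a made-up package path:

```yaml
# azure-pipelines.yml for one package
trigger:
  branches:
    include:
      - master
  paths:
    include:
      - packages/date-picker
```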

But what about the situation when the monorepo is for the whole organization or all front-end teams? Well, it is indeed a challenge at the moment, but work in this direction is moving at an incredible pace. Only a month ago, there was an article on the GitHub blog about a recent Git feature, the sparse index.

Personal or unexpected disadvantages

However, there are no silver bullets, only trade-offs, and a monorepo, even for a single team, is no exception. We encountered some new, specific problems:

  1. Lack of organizational adoption and tooling
  2. The overall popularity of the approach and ecosystem

Lack of organizational adoption and tooling

This is something quite generic, but it should not be overlooked or underestimated because of that. The idea is quite simple: there is a default way of working (like a tradition) in every organization. Even though nothing stops you (including the organization) from doing something different, you should take into account that the amount of ready-made tooling is proportional to how widely the approach is used internally.

A few teams use a monorepo as we do, but more than ninety percent do not. This is reflected everywhere: in the number of available libraries for building, testing, or publishing; in the discoverability of the published components; in the number of utility scripts from other teams that drastically help with the lifecycle management of the components. Basically, with a monorepo, the development and support of such things will lie on the team’s own shoulders.

I don’t mean to complain or anything. It’s just the reality that people should be ready for when they introduce a monorepo.

The overall popularity of the approach

This is quite similar to the first one, but the source of information is external. The general ecosystem is quite rich, and there are many ready-made recipes for all kinds of tasks, yet most of them assume the conventional one-package-per-repository setup rather than a monorepo.

This is more about new developers joining the team. Because the approach is not universal, it takes time to grasp the workflow.

Public advantages

Now I want to focus on the pros of monorepos. Again, we will simply start with the list from Wiki.

  • Ease of code reuse
  • Simplified dependency management
  • Atomic commits
  • Large-scale code refactoring
  • Collaboration across teams

Ease of code reuse

This one is like syntactic sugar at the level of code organization. It can happen that you find common logic in two different components. One way to improve the quality of such code is to extract the common logic into a separate package and use it in both components. Without a monorepo, you would need to create a new repository, put the common code there, publish the package, and then pick it up in the components mentioned above. Compare that to extracting code from two files within one project: you create a file with the shared logic and use it in both files. No separate repository, no building, no publishing, no picking up. That’s orders of magnitude easier and faster, and because of this simplicity, it happens way more often.
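As a sketch (the package and function names are made up), extracting shared logic inside the monorepo boils down to creating one more folder and importing from it:

```js
// packages/format-utils/src/index.js: the newly extracted shared logic
export function formatIban(iban) {
  // insert a space after every 4 characters for display purposes
  return iban.replace(/(.{4})/g, '$1 ').trim();
}

// packages/payment-form/src/payment-form.js: one of the original components
import { formatIban } from '@our-team/format-utils';
```

With Lerna, a command like lerna add @our-team/format-utils --scope=@our-team/payment-form wires up the local dependency, so there is no publishing round-trip during development.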

But there is more: since all the packages are always available, their code can be found with a regular “search.” This improves discoverability. Now the functionality can be found even if it hasn’t been extracted into a separate package with a clear description in its README.

Simplified dependency management

This can be the case, but only if you have one monorepo for the whole organization. Others install our components via the registry, and we install their components via the registry, so for us there is no difference.

Atomic commits

This one again is quite useful. Changing an interface between a dependency and its host in one go is easy.

Large-scale code refactoring

This one is very true. We used jscodeshift transformers to migrate all our packages from one dependency version to another, with hundreds of automatically changed files at once. That would otherwise have been a lot of manual changes. All of this was done in a separate branch while we kept working on new features in these packages.

Even if we ran the codemods against all 30 repositories at once, jumping between them and making tens of commits in each for the required manual changes would be multiple times more difficult than doing everything in one repo.
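To give an idea of how small such a transformer can be, here is a sketch of one that points every import of an old dependency to a new one (the package names are made up):

```js
// rename-import.js
// run with: npx jscodeshift -t rename-import.js packages/*/src
module.exports = function transformer(file, api) {
  const j = api.jscodeshift;

  return j(file.source)
    // find every `import ... from 'old-ui-library'`
    .find(j.ImportDeclaration, { source: { value: 'old-ui-library' } })
    // and point it at the new dependency
    .forEach((path) => {
      path.node.source.value = 'new-ui-library';
    })
    .toSource();
};
```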

Collaboration across teams

Not applicable. Monorepo is only within our team (and probably some others).

Personal or unexpected advantages

These things turned out to be good, even though I didn’t know about them before implementing the monorepo:

  • One project in Gitlab/Github/Azure
  • Small-scale code refactoring
  • Better quality control
  • Experience with Lerna

One project in GitLab/GitHub/Azure or no more ~0.17 open merge requests per project

Having 30+ projects, each with 0-5 issues, and about 5 merge requests spread over all these 30+ projects is a task to manage by itself. Many times the MRs were not merged because they were forgotten, and by the time we found them, they were of no use anymore. The same goes for issues, which could quietly wait for months for some attention. With a monorepo, all issues and merge requests are in the same project, which makes them immensely easier to manage and keep track of. This will be different if you have one monorepo for the entire organization.

Small-scale code refactoring or boy scout rule

This is a small pleasantness next to large-scale code refactoring. Even if you want to refactor a few tests while working on a completely different package, it’s still possible and straightforward to do. The boy scout rule at its fullest.

Better quality control or an unexpected advantage of a disadvantage

When we implemented a publishing pipeline for the monorepo, it ran all the tests for all the packages. This was not what we wanted, because running the tests for a component that we did not touch seemed pointless. So I was constantly looking for a way to fix this. Locally, we could provide a pattern for the paths to run the tests, but we could not do that in the build.

One day our pipeline stopped working at the tests step. The component that we changed was happily passing all its tests, but another component threw an error. During the investigation, we discovered that locally the situation was the same.

The reason turned out to be quite obvious. We had changed the component’s interface and fixed its tests, but other components had become incompatible with it. An example of such a change is the name of a DOM event dispatched by the component.

In other words, if a change needs to be atomic across packages, running all the tests at once in the monorepo can help to find this out.

At the moment, we proudly run all the tests all the time, even during development.
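For completeness, running everything is a single Lerna command, and a filter exists if you ever want to narrow the run down (the Git ref here is just an example):

```sh
# run the "test" npm script in every package
npx lerna run test

# or only in the packages changed since a given ref,
# which, as described above, we deliberately no longer do
npx lerna run test --since origin/master
```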

Experience with Lerna, or is there a difference?

Working with Lerna is nice. Its interface is clear and straightforward. Most of the time, you will need only three commands: lerna bootstrap to install the dependencies, lerna version to bump the versions according to conventional commits, and lerna publish to publish your packages to the registry.
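In day-to-day use this boils down to something like the following (the conventional-commits behaviour comes from a flag or the corresponding lerna.json setting):

```sh
npx lerna bootstrap                       # install and link the dependencies of all packages
npx lerna version --conventional-commits  # bump versions based on the commit messages
npx lerna publish                         # publish the bumped packages to the registry
```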

On top of that, the npm packages of Lerna itself can be helpful for your own scripts. For example, there is @lerna/project with a getPackages method to get the list of all packages with additional info from their package.json.
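For example, a small helper script that lists every package with its current version could look like this; a sketch, assuming @lerna/project is available at the repository root:

```js
// list-packages.js
const { getPackages } = require('@lerna/project');

(async () => {
  // resolves every package matched by the globs in lerna.json
  const packages = await getPackages(process.cwd());

  for (const pkg of packages) {
    // each entry exposes data from that package's package.json
    console.log(`${pkg.name}@${pkg.version} (${pkg.location})`);
  }
})();
```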

Conclusion and recommendations if you want to try

Perhaps my opinion about the monorepo is quite biased because I like it and use it in all of my pet projects and at ING. However, so far, the experience is absolutely positive. It saves time and improves the overall quality of the codebase.

One of the things that will help the adoption of the idea is education. Share resources about it with your peers, make presentations, and give talks. It will help others to understand it better: within the team, developers will be more comfortable working with it; outside of the team, they will know how to adapt their tools to be compatible with a monorepo.

Another one is to connect with other developers who use a monorepo, at your work but also outside of it. Many of the tasks you will face may already have been solved by others, or they can share their experience and keep you from stepping on a rake.
