How I migrated from multi-repository to mono-repository in one day

How I migrated 20K+ commits in 15+ repositories into one repository with a new build process

Eugene Obrezkov
5 min read · Jul 29, 2018

Preface

We had a lot of repositories for different services. Each repository had its own Dockerfile, tests, lint rules, and so on.

It turned out to be hard to maintain, especially when repositories depend on each other. For example, you have an api repository that uses a package from another repository, let’s say commons. If you publish an update in commons, you need to go through all the dependent repositories and update commons there.

Now imagine how long it takes to clone each repository, make the update there, and push the changes back to the remote. It’s hard for me to give an exact figure, but these kinds of updates were costing us about half a day of work just to propagate a change to the other repositories. So, we decided to allocate resources to change that.

But before I started the migration to a mono repository, I spent some time investigating the pros and cons of the alternatives.

What to choose?

Multi-repo

You have a separate repository for each service.

Image taken from https://github.com/boltpkg/bolt#introduction

Pros:

  • You, as an admin of the organization, can easily manage access to different parts of your platform. For example, you may not want to give frontend developers access to the api repository.
  • CI/CD is easier to implement, because all you need is a configuration file in the root of your project that triggers a build job every time you make a commit.

Cons:

  • If some of your repositories depend on each other, a single change in one repository can require updates in the others.

Mono-repo

You have only one repository, which holds all the services.

Image taken from https://github.com/boltpkg/bolt#introduction

Pros:

  • You still have micro-services, but all of them are located under the same folder of the same repository. So, if you are working on a big feature that requires changes in several services, it’s easier to make them in one repository than across a bunch of them.

Cons:

  • Unlike with multi-repo, you cannot allow or disallow access to different parts of your platform. If you give access to the repository, you give access to all of the source code.
  • Another difference from multi-repo is CI/CD. Every commit in your mono repository will trigger a build of every service in it (we will talk more about this later).

Meta-repo

You still have multiple repositories but, in addition, you have an abstract (meta) repository in which all of them are combined into one.

Pros:

  • ?

Cons:

  • ?

Honestly, I tried several tools for meta repositories and couldn’t find either pros or cons. All I can say about the concept of meta repositories is that the tooling is not yet ready for this kind of repository.

As you can see, each approach has its own pros and cons, but it turned out that for our case at elastic.io a mono-repository suits best.

Migrating repositories into mono repository

So, question #1 that you will definitely face if you are going to migrate repositories as well: how do you keep the history without losing your sanity over a lot of copy-paste-merge-repeat work?

Well, git has a tool exactly for this job: git subtree.

Let me show you an example of merging two repositories into one. Let’s say you have a service called api, which is stored in the api repository. The same applies to, let’s say, the frontend service and its repository.

You want to merge the api and frontend repositories into a new mono repository called my-awesome-mono-repo.

mkdir my-awesome-mono-repo && cd my-awesome-mono-repo
git init # do not be afraid, you can start from a clean git repository
git commit --allow-empty -m "Initial commit" # git subtree add needs at least one existing commit
git subtree add --prefix=src/api git@github.com:org/api.git master
git subtree add --prefix=src/frontend git@github.com:org/frontend.git master

git subtree add <repository> <ref> means: take the whole history of that ref from the remote repository and merge it into my current repository.
--prefix (-P for short) means: place the imported tree in the given folder of my current repository.

As a result of those commands, you will get the folders src/api and src/frontend. The entire commit history of those projects is preserved, so you can still run git blame or whatever you want.

That way, I migrated all our services into one repository while keeping the history of changes.
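
With 15+ repositories, repeating the same command by hand gets tedious, so the import can be wrapped in a loop. The following is only a minimal sketch, not the script actually used for the migration: the function names import_service and import_all are hypothetical, and it assumes every repository lives under the same base URL, uses master as its main branch, and that the mono repository already has at least one commit.

```shell
#!/usr/bin/env bash
# Sketch: import several repositories into the current mono repository.
# import_service and import_all are hypothetical helper names.

# Import one repository into src/<service>, keeping its commit history.
import_service() {
  local base_url="$1"          # assumed layout: <base_url>/<service>.git
  local service="$2"           # e.g. api
  local branch="${3:-master}"  # branch to import, master by default
  git subtree add --prefix="src/${service}" "${base_url}/${service}.git" "${branch}"
}

# Import every service given after the base URL; stop at the first failure.
import_all() {
  local base_url="$1"
  shift
  local service
  for service in "$@"; do
    import_service "${base_url}" "${service}" || return 1
  done
}
```

Run from inside the mono repository, a call such as import_all git@github.com:org api frontend would be equivalent to the two git subtree add commands shown earlier.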

Now, how do we make a build process for our mono repository?

Preparing build process for mono repository

As I said before, we had a lot of repositories with their own builds, tests, and so on. Also, each commit in the mono repository triggers a build of everything, not only the changed part. So, how did I configure the build process for the mono repository?

It turns out it’s pretty simple.

First of all, before running any build, I must ensure that the changes in the pushed commit are related to the service I want to build. This is easily achieved with the git diff command.

git diff --name-only HEAD^ HEAD | grep "^src/your_project_name_here/"

The command compares the previous commit with the current commit and prints only the names of the files that changed. Afterwards, I use grep to check whether the changes are related to a certain service.
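
The same idea can be turned around to list every service touched by a commit instead of checking one service at a time. This is a rough sketch under the assumption that all services live directly under src/; the function name changed_services is hypothetical. Like the command above, it only looks at the last commit by default, so a push that contains several commits would need a wider range passed as the first argument.

```shell
#!/usr/bin/env bash
# Sketch: print the name of every service under src/ that changed
# between two commits (by default, the previous commit and HEAD).
# changed_services is a hypothetical helper name.

changed_services() {
  local base="${1:-HEAD^}"
  local head="${2:-HEAD}"
  # Keep only paths of the form src/<service>/..., extract <service>,
  # and drop duplicates. Files outside src/ are ignored.
  git diff --name-only "${base}" "${head}" \
    | sed -n 's|^src/\([^/]*\)/.*|\1|p' \
    | sort -u
}
```

Calling changed_services in a repository whose last commit touched only files in src/api prints a single line, api.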

That way, I implemented a kind of filtering on our CI servers. When a basic environment spins up, it runs my Bash script to check whether the environment should expand further to run the tests and the build, or whether it can skip the whole build job because the commit has no changes for that service.
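
Such a gate could look roughly like the sketch below. It is not the actual script from our CI: the names should_build and run_if_changed are hypothetical, and instead of piping into grep it uses git diff --quiet with a path argument, which exits with a non-zero status when the given path differs between the two commits.

```shell
#!/usr/bin/env bash
# Sketch of a CI gate: run a build command only when the service changed.
# should_build and run_if_changed are hypothetical helper names.

# Succeeds when src/<service>/ differs between the base commit and HEAD.
# If git diff itself fails, this also succeeds, so the build runs
# rather than being skipped silently.
should_build() {
  local service="$1"
  local base="${2:-HEAD^}"
  ! git diff --quiet "${base}" HEAD -- "src/${service}/"
}

# Usage: run_if_changed <service> <command> [args...]
run_if_changed() {
  local service="$1"
  shift
  if should_build "${service}"; then
    "$@"
  else
    echo "No changes in src/${service}, skipping"
  fi
}
```

A CI job for the api service could then call something like run_if_changed api make build, where make build stands in for whatever command actually builds and tests that service.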

Epilogue

All I can say about the migration and our experience with the mono repository at elastic.io is that it is good enough. We have more problems with CI/CD, but we do not have screams from the team anymore (especially the ones we heard when they had to update 15+ repositories because of some change in a single one).

Follow me on Twitter, Facebook, and GitHub, and feel free to ask any questions.

Eugene Obrezkov, Senior Software Engineer at elastic.io, Kyiv, Ukraine.
