We Migrated From GitHub to GitLab
Behind the scenes of the decision and what we’ve learned from the change
Also available in Portuguese
Here at Mercos, this was not our first source code hosting migration: in 2015 we left Beanstalk for GitHub. We were a team of approximately 7 developers and the migration was smooth. The main reason was to get more features: Pull Requests, integrations with other services, etc.
The status quo
As the years went by, we added tools to our workflow: continuous integration, test coverage, static analysis, automatic dependency updates… But GitHub stayed the same, stuck in time. We started looking at GitLab because it offered many of those tools built in, whereas GitHub only offered them through third-party integrations, which meant additional cost because they were separate products.
Then Microsoft acquired GitHub and a number of technical advancements were announced. We decided to wait a bit longer and see whether the new features would make GitHub a better fit than GitLab. We had been using Dependabot for some time, and the announcement that it would be free for GitHub users was very well received. GitHub Actions was the final push we needed to convince ourselves that we had better keep our code on GitHub.
However, theory and practice aren’t always the same. We tried GitHub Actions and it was a bit of a disappointment. It’s a great alternative for open source projects, because it’s completely free for them, but for Mercos it would become very expensive very quickly.
We first tested GitHub’s own infrastructure, which is charged by the minute. Each team has 10,000 minutes included for free in the plan. We depleted those minutes in a couple of days, and when we did the math, buying all the minutes we needed would have been too expensive. A continuous integration business model that encourages developers to test less doesn’t make sense to us.
Our second test was to build our own infrastructure of GitHub Actions agents on AWS. It wasn’t easy: there’s no image available for GitHub Actions agents, and the closest equivalent weighs many GB. Additionally, registering and deregistering agents is cumbersome, forcing us to call GitHub’s API ourselves. Cost-wise, AWS machines are quite cheap, but the maintenance risk, should something go wrong, was too high for us.
The economic impact of the events of 2020 pushed us to make this change now. We had to cut costs significantly. We evaluated GitLab’s feature set again and decided in favour of migrating. Here are the main points that made us change and what we’ve learned from it.
Behind the scenes of the decision
GitHub was charging us $8 per user per month, while GitLab would charge us $4. We ended up choosing GitLab’s free plan, which gave it a somewhat unfair advantage over GitHub. Shortly after our migration, GitHub announced a pricing change that brought their prices very close to GitLab’s. However, the free tier on GitHub doesn’t include basic features like Wikis, and even with similar pricing, GitLab’s offer is more advantageous.
GitLab attracted a lot of attention a while ago because of an accidental data loss incident, after which it took around 24 hours to partially recover customers’ data. What surprised us was their seriousness and transparency in dealing with the situation: they live-streamed debugging sessions showing the steps they were taking to fix the issue, and then published an extensive post-mortem with the measures adopted to prevent it from happening again.
GitHub hasn’t had an incident as serious, but over the years many have seen the feared pink unicorn, which means the web interface is inaccessible.
In the event of downtime, whether on GitLab or GitHub, the code stays backed up on all developers’ machines, and they can keep coding locally with no problems. Regarding issues and boards, we have a strong culture of communication that could temporarily adapt to email or messaging should the platform be unavailable for a few hours. As for deploys, we made sure that deploy scripts can be run manually from SREs’ machines should our CI be unavailable. This gives us confidence that we won’t be held hostage by whichever code hosting platform we choose.
GitLab has an incredible migrator. It imports not only code, but also all branches, issues, wiki pages, pull requests (converted into merge requests), etc. It’s just click and wait. It just works.
Tip: Use GitHub’s Archive Repository tool to make sure nobody will make any changes while the migration is happening.
Tip 2: Add all users to GitLab beforehand, so the migrator can keep the same users linked to their commits, issues, comments and merge requests.
GitLab CI is friendlier to our workflow, which is heavily based on Docker. Building our own infrastructure for GitHub Actions had been quite traumatic and full of workarounds. GitLab CI, on the other hand, comes with a Helm Chart, so in a matter of minutes we had all the infrastructure running in AWS.
Tip: Don’t use commands like docker-compose in GitLab CI. Instead of docker build, use Kaniko with registry cache; it works really well. Instead of docker run or docker-compose run, use the image you just built with Kaniko as the CI image in the next step.
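As a rough illustration of this tip, a .gitlab-ci.yml along the following lines builds the image with Kaniko, using a registry cache, and then runs the next job inside that freshly built image. It is a minimal sketch and not the exact pipeline used at Mercos: the job names, the tagging scheme, the cache repository path and the pytest command are assumptions, while the CI_* variables are GitLab’s predefined ones.

```yaml
stages:
  - build
  - test

build-image:
  stage: build
  image:
    # The "debug" tag of the Kaniko executor ships a shell, which GitLab CI needs
    name: gcr.io/kaniko-project/executor:debug
    entrypoint: [""]
  script:
    # Give Kaniko credentials for the project's container registry
    - mkdir -p /kaniko/.docker
    - echo "{\"auths\":{\"$CI_REGISTRY\":{\"username\":\"$CI_REGISTRY_USER\",\"password\":\"$CI_REGISTRY_PASSWORD\"}}}" > /kaniko/.docker/config.json
    # Build and push, storing cached layers in a separate repository (assumed path)
    - >-
      /kaniko/executor
      --context "$CI_PROJECT_DIR"
      --dockerfile "$CI_PROJECT_DIR/Dockerfile"
      --destination "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA"
      --cache=true
      --cache-repo "$CI_REGISTRY_IMAGE/cache"

test:
  stage: test
  # Run the job inside the image built above instead of calling docker run
  image: $CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA
  script:
    - pytest   # hypothetical test command
```

Because the test job runs directly inside the built image, no Docker daemon is needed on the runner.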
Tip 2: By default GitLab CI runners only accept jobs that have specific tags, which frankly doesn’t make much sense for a default. In order to make them accept all jobs, add the following in the Helm Chart configuration:
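A minimal sketch of such a setting, assuming the gitlab-runner Helm Chart in a version that exposes the runners.runUntagged value (value names have changed across chart releases, so check the values.yaml of the version you install):

```yaml
# values.yaml for the gitlab-runner Helm Chart (sketch; assumes a chart
# version that supports runners.runUntagged)
runners:
  # Let the runner pick up jobs that have no tags at all
  runUntagged: true
```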
Tip 3: GitLab’s documentation about running and scaling runners in Kubernetes isn’t very good. Instead of simply spinning up more Pods, you should increase a setting called concurrent in the Helm Chart, letting GitLab’s Kubernetes executor create Pods dynamically according to demand. Your cluster-autoscaler (which should already be running in the cluster) will then do its magic, starting new cluster nodes when needed.
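In the Helm Chart values this could look like the sketch below, assuming the gitlab-runner chart and its top-level concurrent value. The number 20 is an arbitrary example, not a recommendation.

```yaml
# values.yaml for the gitlab-runner Helm Chart (sketch)
# Maximum number of jobs the runner handles at the same time. With the
# Kubernetes executor each job runs in its own Pod, so this effectively
# caps the number of CI Pods created on demand.
concurrent: 20
```

The runner itself stays a single Pod, and the job Pods it creates are what make the cluster-autoscaler add nodes.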
Issues and Boards
Given that we chose GitLab’s free plan, we weren’t able to create multiple Boards at the Group (Organization) level. So we adapted and started to use more boards in each repository.
Another point is that GitLab’s Boards always show all issues, with no option to add or remove issues from the board. Each stage on the board is identified by a label, so when we drag issues across board stages, we are effectively changing their labels.
These limitations, and the way the features are linked, actually made us more organised with issues and labels, which is nice, because on GitHub lots of issues ended up being forgotten (we had even set up a stale bot to try to mitigate this problem).
As mentioned at the beginning, we used Dependabot as our dependency updater on GitHub. While migrating to GitLab we discovered RenovateBot, which is even smarter (and works with both GitLab and GitHub).
It automatically identifies dependencies on multiple platforms (for instance: Python, Docker, GitLab-CI), and if any dependency isn’t pinned, it first opens an MR pinning that dependency and afterwards opens others to update the versions.
babel. In the documentation there are lots of other configurations that can be made through
Tip: To see the bot’s execution logs, you have to log in to the Dashboard. The link is quite hidden, but it helped us a lot to activate and deactivate repositories, and also to find out why a given dependency wasn’t being identified or updated.
Static analysis (Linter)
GitLab offers a built-in Code Quality analysis that’s very flexible, based on Code Climate’s open source technology. Apart from being free, it’s very easy to set up and to define styling rules for the code. Recently GitHub announced Super Linter, but running it consumes GitHub Actions minutes, which would make things even more expensive.
Tip: Leave the nomenclature and styling rules in the tools’ configuration files (for instance: .pylintrc, etc), including the rules that determine which files ought to be ignored in the analysis. Leave in .codeclimate.yml only the configuration regarding which engines should be run. The analysis will be faster overall and you’re going to have a lot less headache when trying to debug some inconsistency.
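Following this tip, the .codeclimate.yml can stay very small. The sketch below assumes Code Climate’s version 2 configuration format and enables two engines as examples (pylint, plus eslint as a purely illustrative second engine). All naming, styling and ignore rules are assumed to live in each tool’s own file, such as .pylintrc.

```yaml
# .codeclimate.yml (sketch): only declares which engines run.
# Rules and ignore patterns stay in each tool's own config file.
version: "2"
plugins:
  pylint:
    enabled: true
  eslint:      # illustrative assumption, not mentioned in the article
    enabled: true
```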
Last comments and details
GitLab’s interface doesn’t auto-update (Hot Reload) when some new interaction happens in an issue or merge request. Having to always refresh the page manually to see new comments, etc. has bothered us a little.
Mandatory Code Review in merge requests is a feature only available in paid plans. But given that we have a strong culture of code reviews and code quality in general at Mercos, everyone knows the process and values it, so we don’t have to enforce it in software.
All our open source repositories are still on GitHub, because the limitations we found on GitHub don’t apply to open source repositories, which have access to all resources for free.
Review Apps is a feature of GitLab CI that creates, for each branch, an environment to which code from that branch gets deployed and where it stays available for manual tests. We then configure how long that environment should stay active, and after that time GitLab deletes it. It seems to be a very interesting feature for product people to test some small feature or fix before deploying to production, and also for dependency updates on the frontend that need some manual testing.
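To give an idea of what this looks like in .gitlab-ci.yml, here is a minimal sketch. The environment, on_stop, action and auto_stop_in keywords are GitLab CI’s own; the job names, the deploy and teardown scripts, the example.com URL and the one-week lifetime are hypothetical placeholders.

```yaml
deploy_review:
  stage: deploy
  script:
    - ./scripts/deploy-review.sh "$CI_COMMIT_REF_SLUG"   # hypothetical script
  environment:
    # One environment per branch
    name: review/$CI_COMMIT_REF_SLUG
    url: https://$CI_COMMIT_REF_SLUG.review.example.com   # hypothetical domain
    on_stop: stop_review
    # Stop the environment automatically after this period
    auto_stop_in: 1 week
  rules:
    # Every branch except the default one
    - if: '$CI_COMMIT_BRANCH && $CI_COMMIT_BRANCH != $CI_DEFAULT_BRANCH'

stop_review:
  stage: deploy
  script:
    - ./scripts/teardown-review.sh "$CI_COMMIT_REF_SLUG"   # hypothetical script
  environment:
    name: review/$CI_COMMIT_REF_SLUG
    action: stop
  rules:
    - if: '$CI_COMMIT_BRANCH && $CI_COMMIT_BRANCH != $CI_DEFAULT_BRANCH'
      when: manual
      allow_failure: true
```

The stop job can be triggered by hand, and GitLab also runs it when the auto_stop_in period expires.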
Deploy Boards is another feature of GitLab CI that shows inside GitLab’s interface a Kubernetes deployment being made, Pod by Pod, so every developer can follow their deploy being applied live, with no need to access other tools. Additionally, we get log monitoring for free, and if we happen to have a Prometheus installation on the cluster, we also get application monitoring charts, automatically.
The migration from GitHub to GitLab was quite smooth for Mercos. We learned that migrating source code hosting isn’t as big a monster as it might seem. We recommend GitLab for companies and teams that need to cut costs or that want more built-in, native features in their source code hosting service.
We managed to keep all the main features of our workflow, which were previously separate services, using GitLab’s native features. We had to adapt how we manage issues and boards, but that limitation is making us more organised.
Regarding innovation, GitLab has very interesting features when integrated with a Kubernetes cluster, which can be very useful to those that have already started to enter this world of orchestration.
For open source repositories, we conclude that the best alternative is still GitHub, because all features are completely free and the vast majority of open source projects are already there, making it easier for the community to contribute to the project.