Our first eight months with GitLab
Eight months ago we wrote Saying goodbye to GitHub, a post about our decision to migrate all of our code away from GitHub to a self-hosted deployment of GitLab.
Scaling CI builds with GitLab
Infrastructure
Velocity is a key performance metric for our Engineering team. As our team has grown, our maturity in testing has grown too. A key part of doing Continuous Deployment well is that it’s continuous: your tooling must scale as your testing requirements grow to keep your team moving at pace.
GitLab’s built-in CI tool is amazing. It’s easy to forget that we’re using a completely free piece of software. When our team starts something new, we always take the simplest path, and setting up GitLab and the infrastructure around CI was no different.
The diagram above was actually version 2 or 3 of our setup. We started with a single instance running both GitLab and the CI runner. As well as being a terrible idea for redundancy, it ignored how resource-hungry GitLab is: starting builds that consume 100% CPU alongside the main GitLab services won’t work.
From a single instance, we expanded to have the CI runner on another host. This single runner was responsible for builds across all our services, which would often mean engineers were waiting for someone else’s build to complete before their own could begin. Going back to our point about velocity, valuing the time your engineers have is critical to running a lean and productive team. Don’t make them wait for anything. We added specific hosts, each managing a runner dedicated to back-end builds or front-end builds.
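If you want to set up something similar, splitting work across dedicated runner hosts is mostly a matter of registering each runner with its own tag so that only matching jobs land on it. A minimal sketch (the URL, token, and tag names are placeholders, not our actual values; on older versions the binary is gitlab-ci-multi-runner):

    # On the host dedicated to back-end builds (values are placeholders)
    sudo gitlab-runner register \
      --non-interactive \
      --url "https://gitlab.example.com/" \
      --registration-token "REGISTRATION_TOKEN" \
      --executor "shell" \
      --description "backend-runner" \
      --tag-list "backend"

Jobs that declare a matching tag in .gitlab-ci.yml will then only be picked up by that host; the front-end host gets the same treatment with its own tag.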
This version of the system took us a fair distance. Our team grew by another 20% and we were back to contention for build slots.
Scaling Builds
For those of you who have had experience with GitLab, I expect you’re wondering why we didn’t use GitLab’s concurrency features. Good question. GitLab’s CI runner comes with the ability to run builds concurrently. We naively (like many before us, no doubt) thought we’d turn it on and BOOM! We’d have scale in our CI infrastructure. Like everything else involving concurrency, it wasn’t that simple.
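For anyone unfamiliar, the runner’s built-in concurrency boils down to a couple of settings in its config.toml. A minimal sketch with illustrative values, not our production configuration:

    # /etc/gitlab-runner/config.toml (illustrative values only)
    concurrent = 4        # max jobs this runner process will execute at once

    [[runners]]
      name = "backend-runner"
      url = "https://gitlab.example.com/"
      token = "RUNNER_TOKEN"
      executor = "shell"
      limit = 4           # per-runner cap on concurrent jobs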
We had a long list of problems that meant concurrency wasn’t an easy option for us. Builds on the API service would fight over the test database instance. That’s no longer a problem since our migration to Docker, but at the time it seemed like a lot of work to solve. We also saw each build take longer for every additional concurrent build running. Database migrations presented challenges too, as the order they run in is critical. These small but solvable problems continued to stack up, so we went back to the drawing board.
We soon realised that we could get a different kind of concurrency by running a runner for each stage of our pipeline on each of our CI nodes. Below is a screenshot of the pipeline for our main API service.
The build stage now has a dedicated runner on each of the CI nodes, which means we can run up to N build jobs at any one time. Once the build job passes, a test job will spin up on one of the available runners, and again up to N test jobs can run at once. So far this change has been great: it’s nice and stable and has yielded a significant increase in velocity for the team. A key thing to bear in mind with this approach is that you need to watch out for subtle problems caused by assuming the previous stage ran on the same machine.
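To make that concrete, here’s a minimal sketch of how jobs can be routed to stage-specific runners via tags in .gitlab-ci.yml. The stage names, tags, and scripts are illustrative, not our actual pipeline:

    # .gitlab-ci.yml (simplified, illustrative)
    stages:
      - build
      - test

    build_job:
      stage: build
      tags:
        - build          # only runners registered with the "build" tag pick this up
      script:
        - ./scripts/build.sh
      artifacts:
        paths:
          - dist/        # pass outputs forward; don't assume the next stage shares the disk

    test_job:
      stage: test
      tags:
        - test           # may land on a completely different CI node
      script:
        - ./scripts/test.sh

The artifacts block is the main guard against the “same machine” assumption mentioned above: anything a later stage needs has to travel with the job rather than be left behind on a local disk.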
We’re now running around 5k builds per month across all our services with a success rate upwards of 90%. We have very roughly estimated that our engineers are spending a whopping 50% less time waiting for builds.
There’s one last thing I’d like to cover before I move on. A deciding factor for us when we began thinking about moving to GitLab was cost savings. Since then our build infrastructure has grown significantly, so how do we compare to GitHub + CircleCI?
For ten users in our team to access GitHub costs $70 per month. That’s less than half of our entire team, but I conservatively said ten as, on a day-to-day basis, that’s currently all we’d need. The closest plan from CircleCI would give us one concurrent build with four containers (4x parallelism) for $150 per month. So GitHub and CircleCI together weigh in at $220 per month with very little room to grow.
Our CI cluster currently consists of one t2.medium and four t2.small on-demand instances from AWS. Each of the small instances costs $0.026 per hour and the medium is double that at $0.052 per hour, so the cluster costs 4 × $0.026 + $0.052 = $0.156 per hour, which comes out at around $111 per month. This means that running the on-premise CE edition of GitLab, with twice the compute power and unlimited users, is 50% cheaper than GitHub + CircleCI. 💃
Building on top of GitLab and its API
One of the things we like least about GitLab is their REST API. It’s awful to use. We originally had a bot in our Slack channel that was capable of triggering builds remotely, but after our migration to using Environments to orchestrate deployments within our pipelines, we realised that the only way to manage the builds would be via the UI.
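For reference, remotely kicking off a build goes through a trigger token; a minimal sketch against the current v4 API, where the host, project ID, and token are placeholders:

    # Trigger a pipeline for a given ref (values are placeholders)
    curl --request POST \
      --form "token=TRIGGER_TOKEN" \
      --form "ref=master" \
      "https://gitlab.example.com/api/v4/projects/42/trigger/pipeline"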
I believe the GitLab team are currently working on an improved version of the pipeline API, but don’t quote me on that. An improved and easier-to-use API is certainly an area where we feel GitLab could do better.
Stability with Ruby/Memory
Early on in our journey with GitLab we experienced constant issues with memory on the main GitLab server. As the day went on and the number of builds increased, the Ruby processes would start gobbling up memory. We tried to figure out what was causing this, but in the end concluded that giving the GitLab processes more room to breathe was the simplest option.
We also restart GitLab every evening via a cron job. Since we increased the memory and added the nightly restart, we’ve had no further issues with the main GitLab server.
Making sure there’s plenty of swap on the box is also a good idea.
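If you want to replicate either tweak, both are small. A sketch assuming an Omnibus install and a 4 GB swap file; the schedule, sizes, and paths are illustrative:

    # /etc/cron.d/gitlab-restart: restart GitLab nightly (Omnibus install)
    0 2 * * * root /usr/bin/gitlab-ctl restart

    # Add a swap file (run once, as root)
    fallocate -l 4G /swapfile
    chmod 600 /swapfile
    mkswap /swapfile
    swapon /swapfile
    echo '/swapfile none swap sw 0 0' >> /etc/fstab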
Reporting and Analytics
Over the eight or so months that we’ve been using GitLab we’ve seen a steady increase in reporting abilities within the GitLab UI. It’s been critical for us to measure the impact of the changes we’re making to ensure that we’re taking the team in the right direction.
Going back to our point about the state of the API, finding insights beyond what’s provided out of the box can be tricky.
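As an illustration of the kind of query involved, here’s a sketch of pulling recent pipeline results for a project via the v4 API; the host, project ID, and token are placeholders:

    # List recent pipelines for a project (values are placeholders)
    curl --header "PRIVATE-TOKEN: YOUR_ACCESS_TOKEN" \
      "https://gitlab.example.com/api/v4/projects/42/pipelines?per_page=100"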
GitLab know how to ship 🚀
GitLab’s continuous release cycle is extremely impressive. They are backed by a huge community of people driven to make GitLab a wonderful tool to use. Since we started using it we’ve seen eight new versions ship some major features. Check out the changelog.
If you’ve been considering trying GitLab out, either the cloud version or CE, then we’d be only too happy to say: do it! Thanks for reading, and don’t forget to hit the ❤️ button.