This blog post is a write-up on the Composer Cloud Resolver I’ve built over the last two years and finally announced this week during the keynote at the official Contao Conference. Jordi and Nils (the creators of Composer, in case you’ve been living under a rock for the last seven years) were invited as speakers and attended the keynote as well. They are not involved in Contao in any way, but apparently what we’ve built with our Contao Manager (I’ll come back to that later) impressed them, and Jordi mentioned it on Twitter.
And then it all started: A few people asked for links and details so I had to blog about it. So here we are, welcome to the first blog post on my newly created blog 😃
I think before we dive into that whole Composer Cloud Resolver thing it’s important that you understand the background of it.
I’m part of the Contao core developer team. Contao is an open source content management system, and pretty much exactly three years ago we launched version 4, which is built on top of the Symfony full stack framework. We decided not to rewrite everything from scratch but to migrate old code to new code gradually. One important decision we made at the very beginning, though, was to fully embrace Composer. We had been using a custom-built «extension repository» for third-party plugins since ~2009, and for Contao 4 we wanted to replace it with what’s already there: a tool that is well known to PHP developers and much more powerful and reliable than our own solution.
The other thing we really wanted to change was to move away from having system settings in the back end of Contao. In previous versions you could change everything at runtime and thus easily crash your application. This would leave you with a broken setup, and you had no way to fix it other than deleting files on the server to get access to the back end again. Also, having configuration that can change at runtime meant we could not optimize any kind of cache or, for example, remove or configure services at service container build time.
What we really wanted was to use Symfony as a full stack framework the way it was built, and for us that meant using config files and the command line. I’m really, really happy about this decision when I look back now, and we’ve come a long way since. What happened is that over the last three years developers started loving Contao 4. Composer was now a first-class citizen and they were suddenly able to reuse thousands of packages with just a simple requirement in their «composer.json». Moreover, they could now use deployment tools thanks to all the new CLI commands that started making it into the core.
But in life, there are always two sides to every coin.
For us, the other side was our user base, which never really became comfortable with the new version. The major disadvantage was that they could no longer install and manage Contao without using the command line. So from the very beginning, Contao 4 meant a huge transition for our users. They were used to being able to use our web installation tool, and only a small minority had ever worked with the console before.
I admit that at the beginning, we as core developers probably did not realize that the gap would become that huge. But we quickly became conscious that if we didn’t do anything about it, we would have a hard time bringing developers and users together and the gap would only get bigger. That was when the Contao Manager became a thing.
Simply put, it is a GUI shipped as a PHAR file that you can upload to your web server and access using your browser. You can create an account and then start managing the system. The GUI is built using Vue.js and everything is executed via an API, so the Contao Manager actually calls itself. What’s important is that you can also use it on the command line.
So let’s say you wanted to clear the Symfony cache. You would open e.g. «https://example.com/contao-manager.phar.php», log in and click on the «rebuild cache» action. The GUI then sends a request to the PHAR file via the API, and the Manager tries to fork a process in the background (to work around time limits) which in fact executes «php bin/console cache:clear». And the cool thing about this is that the Manager itself also includes Composer. So we can also execute a «composer update» like so: «php contao-manager.phar.php composer update».
There’s a lot more going on, like finding the matching PHP binary, but that’s not really what I’d like to blog about today. Today, I would like to blog about the famous «composer update», which may easily consume up to 1.5 GB of memory. That was our major problem until this week, as only a small number of shared hosting providers offer that much memory. So we had a management GUI now, but it still didn’t work for a lot of people due to RAM issues.
The story of the Composer Cloud Resolver
I started to think about outsourcing the resolving part that consumes a lot of memory in early 2016. I felt like the way Composer was designed would perfectly allow for something like that.
Composer consists of two main commands, «composer update» and «composer install»:
- «update» does the heavy lifting with all the dependency resolving based on your «composer.json» and writes the result of it into the «composer.lock» file. That’s the one that needs a lot of RAM.
- «install» downloads all the packages based on the «composer.lock» and installs them into your «vendor» directory. It doesn’t impose any special requirements on RAM at all.
Given this design of Composer the idea is pretty obvious: Build an API that expects a «composer.json», resolves the dependencies and returns the «composer.lock» as a result.
So I started my journey almost 2.5 years ago and I’ve worked on it from time to time. It all started with the first pull request to Composer where I wanted to have a «composer update --no-install» option so I could disable the actual downloading of all the packages in the cloud, as this was completely pointless. I just wanted to write the «composer.lock». But back then, Jordi didn’t accept it for very valid reasons and probably also because he thought I was insane building something like that 😃
I then left the project aside and enjoyed summer, but apparently I was so obsessed with the idea that I started another attempt in October 2016. This time the goal was that I could at least use the «Installer» class myself and do things programmatically rather than having a command line option. Jordi, being a super nice guy and feeling sorry for me and my obsession, merged the pull request (actually he built a better version of it), which got released with version 1.3 of Composer. 🎉
So I had managed to bring the foundation for it all into Composer itself and I continued working on it sporadically but kept on stumbling over other issues. Here’s a random short list of other things I had to consider:
- Composer Plugins can do things during the resolving process so for security reasons I had to disable plugins
- I also had to disable the autoloader and execution of scripts
- Local repositories as well as artifact ones cannot work within a cloud, so I had to check that none are present and reject such requests
- The platform information of the requesting system always differs from that of the cloud, so I had to make it mandatory that this information is sent to the cloud
- I/O of Composer needed to be redirected somewhere other than the CLI output, and I noticed that I not only needed to provide the resulting «composer.lock» but also the Composer output. How else would one know why something didn’t work?
- The cloud would either be public or need authentication. For Contao’s case it’s public, but to prevent it from being used by anybody I had to add checks for certain packages to be required in the «composer.json»
- I needed a way to pass the packages to update and options like «with-dependencies», «profile» or «prefer-lowest», but at the same time other options such as «no-interaction» or «working-dir» had to be disallowed because they are either dangerous or make no sense within a cloud.
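To give an idea of what a few of these checks could look like, here is a minimal sketch in Python. It is not the resolver’s actual code: the function name «validate_job», the error messages and the choice of «contao/manager-bundle» as the package a public instance insists on are assumptions made purely for illustration. The repository types «path» and «artifact» are Composer’s own names for local and artifact repositories, and the allowed options are the ones listed above.

```python
from __future__ import annotations

from typing import Any

# Composer repository types that only make sense on the local machine.
FORBIDDEN_REPOSITORY_TYPES = {"path", "artifact"}

# Options mentioned in the text as acceptable; everything else is rejected.
ALLOWED_OPTIONS = {"with-dependencies", "profile", "prefer-lowest"}

# Hypothetical: at least one of these must be required for a job to be accepted.
REQUIRED_PACKAGES = {"contao/manager-bundle"}


def validate_job(
    composer_json: dict[str, Any],
    platform: dict[str, str] | None,
    options: list[str],
) -> list[str]:
    """Return a list of human-readable errors; an empty list means the job is acceptable."""
    errors: list[str] = []

    # The cloud cannot know the PHP version or extensions of the target server.
    if not platform:
        errors.append("platform information is mandatory")

    # composer.json allows "repositories" to be a list or an object keyed by name.
    repositories = composer_json.get("repositories", [])
    if isinstance(repositories, dict):
        repositories = list(repositories.values())
    for repository in repositories:
        # Entries such as {"packagist.org": false} are not dicts and are skipped.
        if isinstance(repository, dict) and repository.get("type") in FORBIDDEN_REPOSITORY_TYPES:
            errors.append(f"repository type '{repository['type']}' cannot work in the cloud")

    # Keep a public instance from being used for arbitrary projects.
    if not REQUIRED_PACKAGES & set(composer_json.get("require", {})):
        errors.append("none of the required packages is present in 'require'")

    for option in options:
        if option not in ALLOWED_OPTIONS:
            errors.append(f"option '{option}' is not allowed")

    return errors
```

An allow-list for the options (instead of a deny-list) is a deliberate choice in this sketch: any option nobody has thought about yet is rejected by default.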
Fast forward to June 2018: the Cloud Resolver is now finally running in production and is being used by the Contao Manager. We did it, and people are now happily updating packages on shared hosting with memory limits of around 100 MB. 🎉 🙌
The Composer Cloud Resolver
So let’s get a bit more into technical stuff here and I think it’s easiest if we just look at the way the Cloud Resolver accepts jobs.
Creating a new job is as simple as sending a POST request with your «composer.json», the «composer.lock» and platform information to the «/jobs» endpoint.
I didn’t want to include the «platform» information in the «config» key of the «composer.json» for compatibility reasons. I just wanted to be sure I can manage the platform information independently from the «composer.json».
So here’s our example POST request:
The response of the Cloud Resolver is pretty straightforward:
And that’s it. You can then send GET requests to different resources to get the overall state of a job, the console output of Composer itself and, once the job has finished, the resulting «composer.lock».
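For readers who want something to play with, here is a minimal Python sketch of how a client might assemble such a job request. Only the «/jobs» endpoint and the three pieces of information (the «composer.json», the «composer.lock» and the platform information) come from the description above. The JSON field names «composerJson», «composerLock» and «platform», the helper name «build_job_request» and the host «resolver.example.com» are assumptions for this illustration and not the resolver’s real format.

```python
from __future__ import annotations

import json
import urllib.request
from typing import Any


def build_job_request(
    base_url: str,
    composer_json: dict[str, Any],
    composer_lock: dict[str, Any] | None,
    platform: dict[str, str],
) -> urllib.request.Request:
    """Build (but do not send) a POST request that creates a new resolver job."""
    body = {
        # Hypothetical field names, see the note above.
        "composerJson": composer_json,
        "composerLock": composer_lock,  # None when there is no lock file yet
        "platform": platform,           # e.g. the PHP version of the target server
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/jobs",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_job_request(
    "https://resolver.example.com",  # placeholder host
    {"require": {"php": "^7.1"}},
    None,
    {"php": "7.2.5"},
)
# Sending it would be: urllib.request.urlopen(request)
```

The platform information travels next to the «composer.json» rather than inside its «config» key, which mirrors the separation described above.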
Internally, everything is packed up into containers using Docker and distributed and orchestrated using Kubernetes.
At the moment I’m using Traefik as a load balancer in front of multiple web instances that accept the jobs and push them to the queue.
For the queue I decided to go with Redis. Then there are n workers that get the jobs from the Redis queue, start resolving and eventually push the results to Redis again. So I’m actually using Redis for both the job details and the job queue.
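The flow between web instances, queue and workers can be sketched in a few lines of Python. To keep the example runnable without any infrastructure, it uses a plain in-memory class as a stand-in for Redis (a dictionary for the job details and a deque for the queue). The class and function names as well as the status values «queued», «processing», «finished» and «failed» are made up for this sketch.

```python
from __future__ import annotations

import uuid
from collections import deque
from typing import Any, Callable


class InMemoryStore:
    """Stand-in for Redis: one mapping for job details, one list used as a queue."""

    def __init__(self) -> None:
        self.jobs: dict[str, dict[str, Any]] = {}
        self.queue: deque[str] = deque()


def enqueue_job(store: InMemoryStore, payload: dict[str, Any]) -> str:
    """What a web instance does: store the job details and push the id onto the queue."""
    job_id = uuid.uuid4().hex
    store.jobs[job_id] = {"status": "queued", "payload": payload, "lock": None, "output": ""}
    store.queue.append(job_id)
    return job_id


def work_once(
    store: InMemoryStore,
    resolve: Callable[[dict[str, Any]], tuple[dict[str, Any], str]],
) -> str | None:
    """What a worker does: take one job, resolve it and write the result back."""
    if not store.queue:
        return None
    job_id = store.queue.popleft()
    job = store.jobs[job_id]
    job["status"] = "processing"
    try:
        # `resolve` returns the resulting lock data and the captured Composer output.
        lock, output = resolve(job["payload"])
        job.update(status="finished", lock=lock, output=output)
    except Exception as exc:
        # Keep the error so a client can find out why the job failed.
        job.update(status="failed", output=str(exc))
    return job_id
```

Storing the console output next to the lock data is what allows a client to find out why a job failed, which is the point made in the list of issues further up.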
So it’s Open Source, right? Where can I find it?
No, it’s not and I’m not really sure if it ever will be.
I understand that we probably solved an issue for Contao that every PHP project using some sort of extension/plugin repository is facing, and I’d love to help solve it for everybody.
The thing is, I love Open Source Software. I’ve been working on different projects for over 10 years and I use software other people built every day. But it also has some downsides, the main one being people expecting you to do the work for them.
The Composer Cloud Resolver is not just a piece of software you can unzip and install on some server and it magically just runs. It also requires basic knowledge of infrastructure management and technologies such as Docker and Kubernetes.
It also involves setting up monitoring to manage server resources and debugging when something goes wrong.
I’m afraid of all the support I would have to give and I’d probably fail to live up to my own expectations which is why I decided not to publish it (yet).
However, I’m totally open to discussions! So if you’re working on a project that has the same issue, like WordPress, Drupal, TYPO3 etc., I think we should talk 👌
I like travelling and meeting new people! 😎