Using Docker Containers As Development Machines
Preface
This post assumes that you have a basic understanding of Docker, Docker Compose, and the key terms used in the ecosystem. Should you need to get up to speed, the “Get Started” section of the Docker docs is a great place to start. This is also the first post of a 2-part series on our experience experimenting with Docker to containerize applications at Rate.
In part 1 (this post), we will talk about the steps we took to turn Docker containers into development machines, and the lessons we learnt along the way. In part 2, we will talk about how we use Docker containers to run a distributed application and improve our testing workflow.
Our Motivations
Developers usually have to download a number of tools to set up a dev environment. For a simple web server, this means downloading and installing the language runtime, a database, external CLI tools to perform database migrations, a GUI code editor, and a GUI database client. To further complicate things, developers may also be using different machines with different OSes, which invites cross-platform compatibility issues during initial setup or daily workflows.
A migration script that executes fine in the Terminal of a MacBook is likely to give problems when run in Windows PowerShell or Command Prompt. We have personally experienced issues like this, and they are often not straightforward to solve. The causes range from the unobvious, such as differences in character encoding between Terminal and PowerShell, to more salient ones like CRLF/LF conversions.
We also wanted to simplify the initial setup process for all our applications across the board. This will speed up the onboarding process for new engineers who join our team.
Lastly, we wanted multiple developers to be able to run integration tests locally without having to rely on a single shared remote staging instance. This will allow different developers to perform integration tests on different versions/branches of an application without them having to take turns hosting it on our staging instance. Of course, some tests require testing on a remote instance to best replicate the production environment. But for simpler tests, developers shouldn’t need to rely on the staging instance.
These points can be summarised into the following requirements:
- Alleviate cross-platform compatibility issues.
- Simplify the setup of dev environments.
- Allow developers to conduct independent integration tests.
The Idea
In looking for a solution, we realised that Docker might fit the bill. The way we saw it, using containers as development machines would allow developers to get started with minimal setup. In principle, the development environment would be abstracted from the host OS by having it run in a container.
This allows developers to work on a common container configuration that runs on the same OS and toolset, thereby eliminating cross-platform compatibility issues almost completely. This meets our 1st requirement.
In theory, developers would only need to download Docker and a text editor of their choice, rather than install external tools and dependencies. Code edits would be made in the editor as usual, and the changes would be tracked and propagated from the host to the container. This simplifies initial setup, which meets our 2nd requirement.
In fact, using a running container for development is optional. Developers may use containers to develop only the services they are working on, while the other containers simply host and run the remaining applications. This gives developers the ability to conduct on-demand integration tests by spinning up containers for the required services, satisfying our 3rd requirement.
However, in practice, it was not as seamless and straightforward as we thought it would be, as we will see below.
The Exploration
Our sample application
To provide context for the following problems, let’s assume we are containerizing a simple Go web server. We will run it as a multi-container app orchestrated with Docker Compose. The app consists of 2 services: one for our server and one for our database.
Compose file
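A minimal sketch of what this Compose file might look like. The service names, images, and ports here are illustrative assumptions, not our exact configuration:

```yaml
# docker-compose.yml: a minimal sketch; service names, images, and
# ports are illustrative assumptions rather than our exact setup.
version: "3"
services:
  web:
    build: .                  # built from the Dockerfile below
    ports:
      - "8080:8080"           # expose the Go server to the host
    volumes:
      - .:/go/src/app         # bind mount the project into the container
    depends_on:
      - db
  db:
    image: postgres:11        # an assumed database; substitute your own
    environment:
      POSTGRES_PASSWORD: example
    volumes:
      - db_data:/var/lib/postgresql/data
volumes:
  db_data:
```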
Dockerfile
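And a sketch of the corresponding Dockerfile, assuming a Go project whose dependencies are managed with glide (the Go version and paths are assumptions):

```dockerfile
# Dockerfile: a minimal sketch for a glide-managed Go server.
FROM golang:1.12
# Keep the project inside GOPATH so the vendor/ folder is honoured.
WORKDIR /go/src/app
# Copy dependency manifests first so this layer caches between builds.
COPY glide.yaml glide.lock ./
RUN go get github.com/Masterminds/glide && glide install
# Copy the rest of the source. Note: at run time the bind mount will
# overwrite this directory, which is exactly the problem discussed below.
COPY . .
CMD ["go", "run", "main.go"]
```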
Achieving bidirectional file sync
The first order of business is to ensure that any code edits we do on the host machine are automatically propagated to the container. This makes the development experience feel more natural.
This is only possible through bind mounting, which works similarly to a Linux mount. When we mount a host path to a container path, the contents of the host directory completely overwrite whatever is in the container directory, regardless of whether the container directory has files that were not present in the host directory at mount time. The result is that the container directory becomes an exact snapshot of the host directory.
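With the plain Docker CLI, a bind mount looks like this (the image name and paths are illustrative):

```sh
# Mount the current host directory over /go/src/app in the container.
# Whatever the image had at /go/src/app is hidden by the host contents.
docker run -v "$(pwd)":/go/src/app my-go-server
```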
Problem: No dependencies/Outdated dependencies
Herein lies our first issue. Up to this point, the new developer has only cloned the project and has not installed any dependencies. Because of our initial requirement of not requiring the user to download tools, the developer is not able to run `glide install`. Hence, the host directory will either not have dependency folders, or they will be outdated. This erroneous state of dependencies is then replicated to the container.
Even if we install dependencies during the image build step as an instruction in our `Dockerfile`, they will have no effect, as the folders will be overwritten by the bind mount. This means we are not able to compile and run the server once a container is created, as it does not have the full set of dependencies. This would defeat our purpose of using these containers in the first place.
Therefore, we need to prevent the dependency folders installed during the image build from being overwritten by the bind mount. This can be done in multiple ways.
Possible solution: Bind dependencies to named volumes
Doing this makes these folders immune to the effects of bind mounting. They will instead pull from and push to the data stored in the attached named volumes. The drawback is that these folders are invisible to the bind mount, so they will not be synced from the container to the host either.
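In Compose terms, the trick looks like this (assuming glide’s default `vendor/` directory):

```yaml
# Sketch: a named volume layered over the dependency folder shields it
# from the bind mount of the project root.
services:
  web:
    volumes:
      - .:/go/src/app                 # bind mount the project source
      - vendor:/go/src/app/vendor     # named volume shields vendor/
volumes:
  vendor:
```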
In fact, this is not what we want. We need the dependency folders to exist on the host so that our code editor will not show linting errors on our existing import statements for external libraries.
Of course, if you don’t find that this hinders your development experience, then you are good to go, as the dependencies already exist in the container anyway and the server would be compilable there. But if you wish to solve this, we need to find a way to fulfil the following 3 criteria:
- Bind mount a host directory that has no dependencies.
- Dependencies must be installed in the container generated from the image.
- Dependencies must then be synced back to the host directory.
That means we need to try another approach.
Realistic solution: Install container dependencies to a cache folder
This is done by installing container dependencies to a directory outside the mount destination. It has the same effect as the previous solution: dependency folders are unaffected by the bind mount. For example, if you are mounting to `/app_workdir`, then install to `/dependencies`.
After that, copy all the contents of `/dependencies` to `/app_workdir`. Thanks to the bind mount, our dependencies will now appear on the host too. But… there’s a caveat. The copy operation from `/dependencies` to `/app_workdir` is done by exec-ing into the container (with `docker exec`) and running `cp SOURCE DEST`. For big projects with many dependencies, copying can take as long as ~10 minutes. This may vary depending on the performance of your host machine, but in essence, you have to be mindful of this drawback.
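A sketch of this approach, reusing the glide-based Dockerfile from earlier. The paths are illustrative, and glide may expect the project to live inside GOPATH, so adjust to your layout:

```dockerfile
# Sketch: install dependencies outside the mount destination at build time.
FROM golang:1.12
WORKDIR /dependencies
COPY glide.yaml glide.lock ./
RUN go get github.com/Masterminds/glide && glide install
# The host project is bind mounted over /go/src/app at run time,
# leaving /dependencies untouched.
WORKDIR /go/src/app
CMD ["go", "run", "main.go"]
```

The one-off copy can then be done with something like `docker exec <container> cp -r /dependencies/vendor /go/src/app/vendor`, after which the vendor folder also appears on the host through the bind mount.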
Alternative solution: Install dependencies on container start
Since volume mounting is already in place by the time the container’s start command runs, we can install dependencies at this step without them being overwritten afterwards. This keeps the dependency installation process in the container, yet makes the dependencies available on the host side thanks to the bind mount. However, doing this also lengthens the time it takes for a container to start.
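With Compose, this can be a `command` override that runs the install after the mounts are in place (a sketch, again assuming glide):

```yaml
# Sketch: install dependencies on container start, after the bind mount.
services:
  web:
    build: .
    volumes:
      - .:/go/src/app
    # glide install runs against the bind-mounted source, so the vendor/
    # folder it creates syncs back to the host; startup takes longer.
    command: sh -c "glide install && go run main.go"
```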
Now, keep in mind that the time it takes to copy in solution 1 and to install in solution 2 both increase with the number of dependencies you have.
QoL improvement: Live reload
Now that we’ve solved the dependency problem, we are ready to develop on Docker. In fact, we can further improve the development experience by enabling live reload in our container.
For a Go application, you can use realize to do this. Running `realize start` in the app root will build and run the application entry point and track any changes to `.go` files (note: remember to use polling mode; see the last link under References). Now we can write code on the host and be sure that the container will detect the change and automatically restart the server.
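Polling matters because file-change events from the host may not propagate across the bind mount. A sketch of what the relevant part of `.realize.yaml` might contain, based on our reading of realize’s docs (the interval is illustrative):

```yaml
# .realize.yaml: enable the polling (legacy) watcher so changes made on
# the host are detected inside the container.
settings:
  legacy:
    force: true      # use polling instead of fsnotify events
    interval: 1s     # how often to poll for file changes
```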
Conclusion
Cons
In order for containers to achieve the same level of productivity as native development environments, we need to perform one of the two workarounds mentioned above to solve the dependency problem. The time it takes to install or copy dependencies in both solutions increases as the project grows. This means we have introduced a slowdown in the development process, even though the initial environment setup is simplified.
Another tradeoff is that every command you would previously run in a traditional non-Docker environment now needs to be run inside the container, for example via `docker exec`. These commands are also likely to run slower in the container than on the host machine.
Pros
The experience you get from setting up Docker will be a good preamble to using it as a deployment tool. You will also have a decent understanding of the nuances and limitations of Docker. This brings us to the question: Where does Docker really shine?
What we have realized after all the trial and error is that Docker containers are best suited for developers to run self-hosted applications quickly and easily. Developers can then test their code by connecting to these local instances instead of connecting to remotely hosted instances.
In Part 2, we will explore how we can use Docker containers to run a distributed application with ease.
References
- Exclude folders from bind mounting — https://stackoverflow.com/a/37898591/5832081
- Install dependencies to external cache — https://stackoverflow.com/a/52092711/5832081
- Filesystem changes from host not triggering live reload in container — https://stackoverflow.com/questions/4231243/inotify-with-nfs/52439741#52439741