Dependency Management

During the life cycle of any software project, its dependencies change. The traditional workflow for adding a requirement is so broken that it’s hard to believe it’s still used in 2015. This article shows how vagga makes managing dependencies a simple and non-distracting process.

Traditional Workflow

I’ll use Python as an example. Tools for most scripting languages are very similar. For compiled languages the setup is usually different, but I’ll omit them for brevity.

When we start a project in Python, we create a “requirements.txt” file in the root of the project. Then we create a “virtualenv” for it:

$ echo django > requirements.txt
$ python3 -m venv venvdir
$ . venvdir/bin/activate
$ pip install -r requirements.txt
Downloading/unpacking django (from -r requirements.txt (line 1))
Downloading Django-1.7.4-py2.py3-none-any.whl (7.4MB): 7.4MB downloaded
Installing collected packages: django
Successfully installed django
Cleaning up...

Now we have requirements installed. What if we need to add another one? Just add a line to “requirements.txt” and re-run pip:

$ echo pygments >> requirements.txt
$ pip install -r requirements.txt
Requirement already satisfied (use --upgrade to upgrade): django in ./venvdir/lib/python3.4/site-packages (from -r requirements.txt (line 1))
Downloading/unpacking pygments (from -r requirements.txt (line 2))
Downloading Pygments-2.0.2-py3-none-any.whl (672kB): 672kB downloaded
Installing collected packages: pygments
Successfully installed pygments
Cleaning up...

Great so far. What if we remove a dependency?

$ sed -i '/pygments/D' requirements.txt
$ cat requirements.txt
django
$ pip install -r requirements.txt
Requirement already satisfied (use --upgrade to upgrade): django in ./venvdir/lib/python3.4/site-packages (from -r requirements.txt (line 1))
Cleaning up...
$ python3
Python 3.4.1 (default, Sep 12 2014, 16:29:56)
[GCC 4.8.3] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pygments
>>>

As you can see, the package is still there. This is just a quick refresher, so I’m not going to dive too deeply into the details.

Problems With Traditional Workflow

Here is a quick summary of what can go wrong in a real project:

  1. The user must run the package manager manually to update dependencies (pip install, npm install)
  2. Already installed dependencies are not deleted (this probably depends on the package manager, but I observe this behavior at least with pip and npm)
  3. It’s hard to manage multiple versions (e.g. rapidly switching between two branches with different dependencies)
  4. System dependencies (i.e. binary packages) are not managed at all

In fact, you’re supposed to run “pip install …” after every “git pull”. But nobody does that, and that leads to additional round-trips for many bugs.

And that’s before we even consider handling multiple environments in a single project (which is quite usual for medium to big projects).
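
Problem 2 above can at least be made visible with a small check. The sketch below is a rough illustration, not part of pip or vagga: `requirement_names` and `leftover_packages` are hypothetical helpers, and the parser only understands plain `name` or `name==version` style lines. Packages pulled in as dependencies of your requirements would be reported too, so treat the result as a hint rather than a verdict.

```python
from __future__ import annotations

import re


def normalize(name: str) -> str:
    """Lower-case a project name and collapse runs of '-', '_' and '.' into '-'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def requirement_names(text: str) -> set[str]:
    """Extract bare project names from simple requirements.txt content."""
    names = set()
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):
            continue  # skip blank lines and pip options such as "-r other.txt"
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.add(normalize(match.group(0)))
    return names


def leftover_packages(installed: set[str], requirements_text: str) -> set[str]:
    """Return installed names that requirements.txt no longer mentions."""
    wanted = requirement_names(requirements_text)
    return {normalize(name) for name in installed} - wanted
```

For the session shown earlier, `leftover_packages({"Django", "Pygments"}, "django\n")` returns `{"pygments"}`: the package is gone from the file but still in the environment.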

So What is Vagga?

Vagga is a tool that fixes all of the above problems. It’s a Linux containerization tool similar to Docker or LXC, but its interface is designed specifically for development environments.

Let’s see how to bootstrap a project with vagga:

$ echo django > requirements.txt ❶ 
$ cat <<YAML > vagga.yaml ❷
containers:
  django: ❸
    setup:
    - !Ubuntu trusty ❹
    - !Py3Requirements requirements.txt ❺
YAML
$ vagga _run django python3 -q ❻
[ .. snipped build process .. ]
>>> import django
>>>

What we have here:

  • ❶ We still use requirements.txt for convenience
  • ❷ This is shell syntax that writes the lines that follow, up to the “YAML” marker, to the file “vagga.yaml”. Usually you would just use a text editor.
  • ❸ We name the container “django”. We might use multiple containers (e.g. put mysql in another container)
  • ❹ We use Ubuntu in the container (as an example). It doesn’t matter whether the host system is Ubuntu, Fedora, Nix or whatever; the Linux distribution inside this specific container is Ubuntu. You should use the one you will use in production.
  • ❺ We tell vagga to read Python dependencies from the file “requirements.txt”. This effectively means that dependencies will be installed by “pip”, and a little bit more…
  • ❻ Then we just run vagga and see that everything works as expected. The underscored “_run” command is just a low-level API; I’ll show a better API in a minute.

See the reference for the full list of supported build steps. Note that we don’t have a “build virtual environment” or “install the packages” step. We just “run the command”.

How Vagga Works

When you run a command, vagga does the following:

  • reads and parses vagga.yaml;
  • computes a hash of all dependencies of the container to run;
  • builds the container if none exists with the hash just computed;
  • runs the command inside.
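
The steps above can be sketched in a few lines of Python. This is only an illustration of the idea, assuming the hash covers the container definition text plus the contents of the files it reads. Vagga’s real hashing and directory handling may differ, and `container_hash` and `ensure_container` are hypothetical names.

```python
from __future__ import annotations

import hashlib
from pathlib import Path
from typing import Callable


def container_hash(config_text: str, dependency_files: list[Path]) -> str:
    """Hash the container definition plus the contents of files it depends on."""
    digest = hashlib.sha256()
    digest.update(config_text.encode("utf-8"))
    for path in sorted(dependency_files):
        # Mix in the file name so that renaming a file also changes the hash.
        digest.update(b"\0" + path.name.encode("utf-8") + b"\0")
        digest.update(path.read_bytes())
    return digest.hexdigest()[:8]


def ensure_container(
    roots: Path, name: str, version: str, build: Callable[[Path], None]
) -> tuple[Path, bool]:
    """Return (root directory, was_built); build() runs only on a cache miss."""
    root = roots / f"{name}.{version}" / "root"
    if root.is_dir():
        return root, False  # a container with this hash already exists
    root.mkdir(parents=True)
    build(root)  # e.g. install the distribution and the pip packages
    return root, True
```

Because the directory name is derived from the inputs, "is my environment up to date?" reduces to "does this directory exist?", which is why no separate install step is needed.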

OK, let’s inspect some details. First let’s define a “command” so that it’s easier to use vagga:

$ cat <<YAML >> vagga.yaml
commands:
  py3: !Command
    container: django
    run: python3 -q
YAML
$ vagga
Available Commands:
py3

To run a command, simply run “vagga command_name”:

$ vagga py3
>>> import django ❶
>>> import os, sys
>>> sys.path ❷
['', '/usr/lib/python3.4', '/usr/lib/python3.4/plat-x86_64-linux-gnu', '/usr/lib/python3.4/lib-dynload', '/usr/local/lib/python3.4/dist-packages', '/usr/lib/python3/dist-packages']
>>> os.getcwd() ❸
'/work'
>>> os.listdir() ❹
['.vagga', 'vagga.yaml', 'requirements.txt']

  • ❶ Confirm that django imports here
  • ❷ Let’s look at “sys.path”. It has only system paths, i.e. django is installed in a system location (inside the container)
  • ❸ Our project directory is mounted as “/work” inside the container
  • ❹ Confirm that it’s our project directory by looking at the file list. Note that there is a new hidden folder, “.vagga”

Okay, let’s look at the “.vagga” folder:


$ ls -l .vagga
total 0
lrwxrwxrwx 1 pc users 27 Feb 11 23:54 django -> .roots/django.40e346d8/root ❶
$ cd .vagga/django
$ ls usr/local/lib/python3.4/dist-packages ❷
django
Django-1.7.4.dist-info

  • ❶ There is a “django” symlink in “.vagga”, named after the container in the configuration. The symlink points to a folder with a hash in its name. We also see that all versions of our container are stored in “.vagga/.roots”
  • ❷ Confirm that django is at the expected location inside the container

The “.vagga/django/usr/local/lib/python3.4/dist-packages” path can be added to your IDE to make code analysis tools (e.g. auto-completion) work on always up-to-date packages.

Adding a Dependency

Let’s look at the workflow for adding a dependency:

$ echo pygments >> requirements.txt
$ vagga py3
[ .. snipped rebuild of container .. ]
>>> import pygments
>>>

Works like a charm! Note that we just changed a file and a new environment is built for us. The “requirements.txt” may also change when you do a “git pull”. You don’t need to think about this; it just works.

Let’s see what we have behind the scenes:

$ ls -l .vagga
total 0
lrwxrwxrwx 1 pc users 27 Feb 12 00:14 django -> .roots/django.0bc98218/root ❶
$ cd .vagga/django
$ ls -1 usr/local/lib/python3.4/dist-packages
django
Django-1.7.4.dist-info
pygments
Pygments-2.0.2.dist-info
$ cd ../..
$ ls -1 .vagga/.roots
django.0bc98218
django.40e346d8 ❷

  • ❶ There is a new hash
  • ❷ The old folder is still here

And as you might guess, removing a dependency works well too:

$ sed -i '/pygments/D' requirements.txt 
$ cat requirements.txt
django
$ vagga py3 ❶
>>> import django
>>> import pygments ❷
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named 'pygments'
>>> quit()
$ ls -l .vagga
total 0
lrwxrwxrwx 1 pc users 27 Feb 12 00:38 django -> .roots/django.40e346d8/root ❸

  • ❶ Note: no container rebuild here; the container is taken from the cache (but if you have deleted the old one, it will be built just like the first time)
  • ❷ Confirm that pygments is not installed
  • ❸ The link points to the old container hash again

This powerful concept of immutable containers accessed by hash works well for any kind of dependency change.
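
The round trip shown in the listings (add a dependency, remove it, land back on the first hash) follows directly from naming each root after a hash of its inputs. Here is a minimal sketch of that behavior, assuming the hash depends only on the text of “requirements.txt”. The `short_hash` and `activate` helpers are hypothetical and only mimic the “.vagga” layout seen above; they are not vagga’s implementation.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def short_hash(requirements: str) -> str:
    """Derive a short, stable identifier from the requirements text."""
    return hashlib.sha256(requirements.encode("utf-8")).hexdigest()[:8]


def activate(vagga_dir: Path, name: str, requirements: str) -> bool:
    """Point the `name` symlink at the root matching these requirements.

    Returns True if a new root had to be created, False on a cache hit.
    """
    version = short_hash(requirements)
    relative_root = Path(".roots") / f"{name}.{version}" / "root"
    root = vagga_dir / relative_root
    built = not root.exists()
    if built:
        root.mkdir(parents=True)  # a real tool would build the container here
    link = vagga_dir / name
    if link.is_symlink():
        link.unlink()  # old roots are left in place, only the link moves
    link.symlink_to(relative_root)
    return built
```

Calling `activate` with `"django\n"`, then `"django\npygments\n"`, then `"django\n"` again creates two roots and returns `True`, `True`, `False`: the third call only moves the symlink back.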

Isn’t It Too Limited?

You might think that hard-coding “pip” into a virtualization tool is wrong. But we have a tool built specifically for development environments, so in fact it’s very convenient. And this approach is no more limited than the “run any bash command” idea that Docker has. Let’s see how we would do it in a generic way:

setup:
- !Ubuntu trusty
- !UbuntuUniverse ~
- !Install [python3, python3-pip]
- !Sh "pip install -r requirements.txt"

Note that we need to enable “Universe” in Ubuntu, since that’s where the python3-pip package is, and install pip itself. Well, apparently we need more:

  • ❶ Rebuild the container when “requirements.txt” changes
  • ❷ Remove “pip” after installation
  • ❸ Cache packages between subsequent builds of containers (so that rebuild is quick)
  • ❹ Install tools to fetch/build packages (including python-dev and for example “git” if there is a git link for a package)

Here is how we can achieve it using lower-level commands:

setup:
- !Ubuntu trusty
- !UbuntuUniverse ~
- !Install [python3]
- !BuildDeps [❷python3-pip, python3-dev, git❹]
- !CacheDirs pip-cache: /tmp/pip-cache ❸
- !Depend requirements.txt ❶
- !Sh "❸PIP_DOWNLOAD_CACHE=/tmp/pip-cache
    pip install -r requirements.txt"

The config looks cluttered and hard to remember, so we chose simplicity for the common case. Still, it’s good to have the low-level tools for doing something very project-specific.

Conclusion

Vagga makes dependency management a breeze. This is accomplished by rebuilding the container on each dependency change, and by caching packages and tracking versions behind the scenes.

Unfortunately, the only competitor for vagga in this field is nix, which requires learning another language and a lot more boilerplate for each dependency, while vagga uses the de facto standard way of declaring dependencies for each programming language (e.g. “requirements.txt” for Python; other languages are coming soon).

We have touched only a small part of vagga’s functionality. There are also tools for running multiple processes simultaneously and for testing network tolerance, which make the development process more friendly.
