Salt vs. Chef, Fabric

Trying to replace stuff with newer stuff

Jun 18, 2013

Ever since cfengine (and probably ever since shell scripting has existed), many companies have tried to build the Holy Grail of provisioning and interactive/automated deploys. Some of the hot players in the field are Chef from Opscode and Puppet from Puppet Labs. Salt is a relative newcomer to the game, and I wanted to give it a try for my automated deployments and my provisioning.

This post will serve as an aide-mémoire on the current state of Salt, a Python framework that aims to replace Chef for provisioning and Fabric for interactive runs. Of course, things are changing quickly, and some of this is probably already wrong by the time you read it. YMMV ;)

Our current deployment infrastructure relies on Fabric plus a Chef library that extracts the information about what to deploy where. For us, Fabric is mostly a stateful SSH multiplexer. Because we deploy our application on all servers of our grid, the fact that Fabric forks a process per server (this can be changed, but it can make deploys slightly slower) can sometimes be a bit of a pain. I was also not satisfied with the way we log things when running Fabric: we need to fetch all the output and reassemble it to store it in a database for later review.
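To make the logging pain concrete, here is a minimal sketch of the "fetch, reassemble, store" step. It assumes that each captured line is prefixed with `[host] ` and that a single SQLite table is enough; the function names, the table name and the line format are illustrative assumptions, not our actual deploy code.

```python
import sqlite3
from collections import defaultdict


def reassemble(lines):
    """Group interleaved "[host] text" lines into one ordered log per host."""
    per_host = defaultdict(list)
    for line in lines:
        if not line.startswith("["):
            continue  # no host prefix: not attributable to a server
        host, sep, text = line[1:].partition("] ")
        if not sep:
            continue  # malformed prefix, skip it
        per_host[host].append(text)
    return {host: "\n".join(chunks) for host, chunks in per_host.items()}


def store(conn, deploy_id, logs):
    """Persist one row per host for a given deploy (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS deploy_log "
        "(deploy_id TEXT, host TEXT, output TEXT)"
    )
    conn.executemany(
        "INSERT INTO deploy_log VALUES (?, ?, ?)",
        [(deploy_id, host, output) for host, output in sorted(logs.items())],
    )
    conn.commit()


if __name__ == "__main__":
    captured = ["[web1] out: pulling", "[web2] out: pulling", "[web1] out: done"]
    connection = sqlite3.connect(":memory:")
    store(connection, "deploy-1", reassemble(captured))
    print(connection.execute("SELECT * FROM deploy_log").fetchall())
```

The point is only that output from parallel runs arrives interleaved, so something has to sort it back out per server before it is worth reviewing.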

Salt promises it can replace both Chef and Fabric by being more powerful (than Fabric) and much easier to understand (than a Ruby-based DSL).

Documentation

Salt documentation is extensive. There is a lot to read and understand, because Salt works with concepts that differ from what you usually read about. At first, the concepts of grains, pillars, states (low, high, over) and modules will look a bit alien, but after some practice they turn out to be a very efficient way to “mix” provisioning and deployment in one nice package.

Outside the main documentation, which is generated from the source code (be careful, your installed version is probably a bit older than the online documentation), there is a Google group and an IRC channel. The people on IRC are very helpful.

The whole project is hosted on GitHub, and the maintainer of the project is extremely responsive and nice.

Installation

Salt is based on 0MQ and written in Python, and the company behind it provides packages for most Linux distributions, Windows and OS X. The installation is straightforward and there is no “hard” dependency (such as Solr or RabbitMQ). This is a very good point, as it lets anyone start playing with Salt in a matter of minutes. Once installed, the master runs a service and so do the minions. These services are linked by an authenticated 0MQ connection, which is used to exchange data and commands.

Minion registration is done as in Puppet, with a key exchange and a confirmation on the master. Once this is done, it’s very easy to run a given “command” on all minions or a subset of them.

The default installation provides an equivalent of Chef’s Ohai (the built-in grains), and you can start doing a lot of things in parallel on all the minions at once. Because the minions are always connected to the master, this is very, very fast. Much faster than with Fabric, where an SSH connection must be opened to every target node before anything can be done.
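As a rough mental model of "all minions or a subset of them", here is a toy selection function. On a real master this is roughly what `salt 'web*' test.ping` (glob on the minion id) or `salt -G 'os:Ubuntu' test.ping` (match on a grain) do for you. The inventory, the grain values and the function names below are made up for illustration and say nothing about how Salt implements matching internally.

```python
from fnmatch import fnmatchcase

# Hypothetical inventory: minion id -> grains (facts reported by the minion).
MINIONS = {
    "web1": {"os": "Ubuntu", "roles": ["web"]},
    "web2": {"os": "CentOS", "roles": ["web"]},
    "db1": {"os": "Ubuntu", "roles": ["db"]},
}


def match_glob(pattern, minions=MINIONS):
    """Select minions whose id matches a shell-style glob."""
    return sorted(m for m in minions if fnmatchcase(m, pattern))


def match_grain(key, pattern, minions=MINIONS):
    """Select minions where a grain (scalar or list) matches a glob."""
    selected = []
    for minion_id, grains in minions.items():
        value = grains.get(key)
        values = value if isinstance(value, list) else [value]
        if any(v is not None and fnmatchcase(str(v), pattern) for v in values):
            selected.append(minion_id)
    return sorted(selected)


if __name__ == "__main__":
    print(match_glob("web*"))
    print(match_grain("os", "Ubuntu"))
```

Since the master already holds a connection to every minion, selecting the targets is the cheap part; there is no per-node connection setup as with SSH.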

Usage

Salt basics are complicated at first, because most examples revolve around running commands interactively. When you are looking for something to deploy your grid, most of the examples will not seem automated enough.

After some documentation reading, I started to understand the difference between states and modules. You know you have to write something, but is it going to be a state file, a Python state file, or Python module functions? It depends. Mostly. After a while, I suppose this becomes more natural, but you usually have the choice between a very smart module function (which can use a lot of internal state) and regular sls files (or even a Python state file), in which you have limited control over the “execution flow”. Some states implement a watch feature that allows trigger-like firing when something happens, but this does not apply to all state functions. Soon, you’ll want something more clever, and you’ll probably have to switch to a function. This duality between modules and states (sls and py) is very powerful and reminds me of Provider vs. Recipe in Chef. You start working with states, and when you want something more powerful, you can rewrite it as one or several functions and/or modules. However, because the expression language used for these things is not the same, you really do have to rewrite them (whereas in Chef, the languages are closer).
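The module/state split is easier to see in code. Below is a plain-Python sketch, not loaded by Salt: `write_file` plays the role of a module function (it just does the work), while `managed` plays the role of a state function (it checks the current situation, acts only if needed, and reports back). The return dictionary with `name`, `changes`, `result` and `comment` follows the shape Salt state functions return; the function names and the file example are assumptions chosen for illustration.

```python
import os
import tempfile


def write_file(path, contents):
    """Module-style: imperative, does the work every time it is called."""
    with open(path, "w") as handle:
        handle.write(contents)
    return True


def read_file(path):
    """Module-style helper: return the file contents, or None if absent."""
    if not os.path.exists(path):
        return None
    with open(path) as handle:
        return handle.read()


def managed(name, contents):
    """State-style: describe the desired end result, act only when it differs."""
    ret = {"name": name, "changes": {}, "result": True, "comment": ""}
    current = read_file(name)
    if current == contents:
        ret["comment"] = "File is already in the desired state"
        return ret
    write_file(name, contents)
    ret["changes"] = {"old": current, "new": contents}
    ret["comment"] = "File updated"
    return ret


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "motd")
    print(managed(target, "hello\n"))  # first run reports a change
    print(managed(target, "hello\n"))  # second run reports nothing to do
```

A non-empty `changes` entry is the kind of signal a watch-like trigger can react to, which is why the feature only makes sense for states that report their changes properly.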

So far, what’s missing?

Not a lot. As a simple Chef replacement, it seems most things are in place. Salt’s designers have chosen a simple and efficient approach, so you probably have more things to write than with Chef, but this is not obvious at first sight. You must upload your modules and everything else from the master node and cannot delegate this to another administration node, which is kind of annoying when you are used to doing that with Chef.

The query language for searching grains, pillars and the rest is a bit limited for now, and it’s easy to run into missing features, but this can probably be worked around quite simply.

As a Fabric replacement, I’m more reserved. The overstate mechanism, which allows for dependencies and sequential runs, is very simple and limited because of the “DSL” it imposes. The documentation claims that the DSL is mostly a hash/array YAML description of what’s going to be called with which parameters, but there is no trivial example of how to write this differently (for example, to leverage Salt features from traditional stateful Python code). There is no state available in the overstate, and no loop. If one step fails, the whole process aborts.
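Here is a toy model of what that hash/array description amounts to, written as a Python dictionary plus a tiny runner. The stage names, the `match`/`sls`/`require` keys as used here, and the `run_stage` callback are illustrative assumptions; this is a sketch of the behaviour described above (ordering by requirement, abort on first failure), not Salt's overstate code.

```python
# Hypothetical stages: each one targets some minions and may require others.
STAGES = {
    "mysql": {"match": "db*", "sls": ["mysql.server"]},
    "webservers": {"match": "web*", "sls": ["nginx"], "require": ["mysql"]},
    "all": {"match": "*", "require": ["mysql", "webservers"]},
}


def order_stages(stages):
    """Return stage names so that each stage follows the ones it requires."""
    ordered, visiting = [], set()

    def visit(name):
        if name in ordered:
            return
        if name in visiting:
            raise ValueError("circular requirement at %r" % name)
        visiting.add(name)
        for dependency in stages[name].get("require", []):
            visit(dependency)
        visiting.discard(name)
        ordered.append(name)

    for name in stages:
        visit(name)
    return ordered


def run_stages(stages, run_stage):
    """Run stages in order; the first failing stage aborts the whole run."""
    done = []
    for name in order_stages(stages):
        if not run_stage(name, stages[name]):
            return {"ok": False, "done": done, "failed": name}
        done.append(name)
    return {"ok": True, "done": done, "failed": None}


if __name__ == "__main__":
    # Pretend the webservers stage fails: "all" is never attempted.
    print(run_stages(STAGES, lambda name, stage: name != "webservers"))
```

Everything I miss (loops, shared state between steps, deciding what to do when a step fails) would live in the `run_stage` callback or around `run_stages`, which is exactly the part the declarative description does not let you write.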

Salt does not deal with interactivity (something I believe Fabric is very good at), and that doesn’t bother me. But Salt also eats the logs by default, giving you very little feedback (mostly the states/modules output, which is kind of normalized in the code but never enforced). I think that a deployment/provisioning tool should be able to centralize all logs, simply because this is one of the important parts of the tool.

And now?

I’ve started asking questions on the Salt list and I was made very welcome. The next step will be to find out if there are answers for me in Salt…

Please do not hesitate to give me some feedback on this article. It contains several spelling mistakes I’ve left here for your good pleasure. Please report them :)

The green tree python comes from flickr.