Building a progress indicator for the comm utility
Today I found myself needing to use the comm utility to compare two 20GB files. This ended up taking about 10 minutes, and while it was running I got curious about how much time was left. Knowing that comm must read through both files before it finishes, I decided to see if I could build a simple progress indicator based on the read offset of one of its file descriptors.
I jumped straight into the /proc filesystem and discovered the fdinfo directory. This directory contains a file for each of the process’ file descriptors indicating the offset, file mode and mount ID. …
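As a rough illustration of the idea, here is a minimal sketch that turns the offset recorded in an fdinfo file into a percentage. The function name fd_progress and its arguments are hypothetical, not taken from the original post; the only thing it relies on is the "pos:" line that Linux writes to /proc/<pid>/fdinfo/<fd>.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print how far a file descriptor has advanced, in percent.
#   $1: path to an fdinfo file (on Linux, /proc/<pid>/fdinfo/<fd>)
#   $2: total size in bytes of the file being read
fd_progress() {
  local fdinfo=$1 total=$2 pos
  # The "pos:" line holds the current read offset in bytes.
  pos=$(awk '$1 == "pos:" { print $2 }' "$fdinfo") || return 1
  # Bail out if the offset is missing or the size is not positive.
  [ -n "$pos" ] && [ "$total" -gt 0 ] || return 1
  # Integer percentage, rounded down.
  echo $(( pos * 100 / total ))
}
```

Called in a loop against the fdinfo file of a running comm process, together with the size of the corresponding input file, this would give a crude progress readout. Which descriptor number belongs to which input file is something you would need to check under /proc/<pid>/fd first.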
One parser to rule them all
Bash is a wonderful and terrible language. It can provide extremely elegant solutions to common text processing and system management tasks, but it can also drag you into the depths of convoluted workarounds to accomplish menial jobs.
Recently I ran into one of these situations when trying to parse varying inputs in a control script. The standard tools for argument parsing were breaking at every turn as I changed argument order and added flag options for flexibility. …
Use Git hooks to create your own push deployment pipeline
I recently got interested in supporting Heroku-style push deployments in the AWS stack at Fullscreen, Inc. There are a few solutions for this in the wild, like Dokku, but I wanted to use something flexible enough to grow with our deployment infrastructure that wasn’t dependent on container technology. The solution I came up with involves a deploy server, a Git hook and a bit of Bash.
Before I could have engineers start pushing code, I needed to set up a deployment server. The most basic setup involved a new instance with a “git” user for authenticated pushes over ssh. …
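To give a feel for the Git hook part, here is a minimal sketch of the core of a post-receive hook. It is not the hook from the original post: the function name plan_deploys, the DEPLOY_BRANCH variable and the echoed "deploy" action are all hypothetical placeholders. What it does rely on is that Git passes a post-receive hook one line per updated ref on stdin, in the form "old-rev new-rev ref-name".

```shell
#!/usr/bin/env bash
# Hypothetical setting: the branch that should trigger a deployment.
DEPLOY_BRANCH=${DEPLOY_BRANCH:-master}

# Read "<old-rev> <new-rev> <ref-name>" lines from stdin, as Git supplies
# them to a post-receive hook, and print one line per deployable push.
plan_deploys() {
  local oldrev newrev refname
  while read -r oldrev newrev refname; do
    # Ignore pushes to any ref other than the deploy branch.
    [ "$refname" = "refs/heads/$DEPLOY_BRANCH" ] || continue
    # An all-zero new revision means the branch was deleted; skip it.
    case $newrev in *[!0]*) ;; *) continue ;; esac
    # Placeholder for the real deployment step.
    echo "deploy $DEPLOY_BRANCH at $newrev"
  done
}
```

In a real hook file (hooks/post-receive in the bare repository on the deploy server), the script would end by calling plan_deploys so that it consumes the hook's stdin, and the echo would be replaced with whatever actually ships the code.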
If you haven’t heard of it, AWS’ CloudFormation service allows you to define complex application stacks in JSON templates and offload all of the resource creation and management to the cloud. It’s usually a combination of your best friend and your worst enemy due to a lack of flexibility and transparency into what will happen when a template change is applied to a stack.
An especially annoying aspect of the JSON templates is that there is no way to leave breadcrumbs detailing why your resources are configured the way they are. JSON as a spec does not support comments, and in order to future-proof their own service AWS doesn’t allow custom keys in resource declarations (e.g. …
One of SaltStack’s many strengths is its extensibility. It’s possible to write custom states, execution modules, grains, pillars, renderers and returners all in Python and all with the power of SaltStack’s internal modules. Let’s take a look at writing a custom module that decorates the pillar with information from an external resource. For this exercise, we’ll create an external pillar that reads data from a Redis database.
We’ll start by creating our external pillar module file. In your Salt base or root directory, create a folder called “modules” and a sub-folder called “pillar”:
/srv/salt $ mkdir -p modules/pillar
Next, we’ll create our pillar…
Provisioning and configuring multiple nodes with a tool like SaltStack is often trivial once your recipes are well defined. Things can become more complicated, though, when you start to deploy code or critical system configuration changes.
The SaltStack orchestration runner is a salt runner that executes commands across targeted minions from the salt-master. Because commands are not run by your minions and orchestration can make use of the requisite system, all kinds of possibilities open up for state enforcement and error handling.
The real power of the orchestration system becomes apparent when you start dealing with code deployments. In a typical application update, new code is deployed to new or existing machines, any application requirements are installed or updated and the application is gracefully reloaded. Each of these steps represents a failure point that could leave your machines running different versions of the same code. …
The internet is rife with promises of 100% availability when using HAProxy for load balancing. THEY ARE LIES!
When you instruct HAProxy to reload its configuration, the following occurs:
Seems pretty straightforward, but there’s a small window of time between steps 1 and 2 where neither process is bound to the configured ports and requests are rejected. …
The git commit log is one of the best tools for understanding the history and direction of a project. I’ve found there are a ton of situations where the log can make complex interactions with a repo a breeze.
Sometimes a branch you want to update can fall too far behind its base branch and a merge could introduce potentially unwanted commits. To find the difference in commits between two branches, use the --cherry-pick option:
git log --left-right --graph --cherry-pick --oneline master...production
< f472340 unshared commit in master branch
> 284f4b9 unshared commit in production branch
The --cherry-pick option omits commits that introduce the same change on both sides, leaving only the commits that exist in one branch but not the other, while --left-right indicates which side the commit belongs to. …
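To see the filtering in action, here is a small throwaway sketch that builds a temporary repository where one commit has been cherry-picked from master into production. Everything in it (the function name, file names, commit messages and the demo identity) is made up for illustration, and --graph is left out to keep the output flat.

```shell
#!/usr/bin/env bash
# Hypothetical demo: build a scratch repo and compare two branches.
demo_cherry_pick_log() (
  set -e
  repo=$(mktemp -d)
  cd "$repo"
  git init -q
  # Make sure the first branch is called master regardless of local defaults.
  git symbolic-ref HEAD refs/heads/master
  # Wrapper that supplies a throwaway identity for commits.
  g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

  echo base > base.txt;     g add base.txt;   g commit -q -m "base"
  git branch production

  echo m > master.txt;      g add master.txt; g commit -q -m "master only"
  echo shared > shared.txt; g add shared.txt; g commit -q -m "shared fix"

  git checkout -q production
  # Copy the tip of master ("shared fix") onto production.
  g cherry-pick master > /dev/null
  echo p > prod.txt;        g add prod.txt;   g commit -q -m "production only"

  # "shared fix" exists on both sides as an equivalent patch, so it is omitted.
  git log --left-right --cherry-pick --oneline master...production
  cd /
  rm -rf "$repo"
)
```

Running demo_cherry_pick_log should list "master only" on the < side and "production only" on the > side, with the cherry-picked "shared fix" commit absent from both.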
If you haven’t heard about Ansible yet, it’s worth giving the docs a read before proceeding.
As my deployment strategies grow in complexity, I find that I really need Ansible to be aware of environment configuration. By default Ansible doesn’t provide a mechanism for managing multiple environments, but with a little hacking we can get a pretty comprehensive solution.
My directory structure looks something like this:
Each environment has its own directory that contains a dynamic inventory script and a group_vars directory. This allows two things to happen: