All Aboard the Deploy Train, Stop 1 — Anyone for tea?

Aaron Kalair
Mar 30, 2018

At Conversocial we’ve recently finished replacing our deployment tool with one better suited to where we find ourselves today.

This series of posts will discuss what the old tool looked like, the issues with it, and what we envisioned a next-generation tool looking like. Then we’ll look at how we designed, built, validated and released this tool.

Some Background

Conversocial is a social media customer service platform. We import content from social networks for our customers and provide them with a tool for their support agents to respond to messages efficiently.

We have a monolithic Django app on the backend responsible for handling API requests sent from our frontend, which is the usual mix of JavaScript, CSS and HTML.

On top of this we have “poller” processes that pull in content from social networks, and “worker” processes that handle periodic tasks and respond to events in the system. Examples include updating our analytics datastore when agents interact with customer queries, and prioritising content when it arrives in the platform.

The Existing Tool

Previously we had 3 main classes of server: workers, pollers and webs. These ran the process types defined above.

To define a process that ran on our infrastructure we used a Chef cookbook that pulled data bags (blobs of JSON stored on your Chef server that nodes running Chef can pull down and use) containing process definitions that looked like:

{
    cb_roles: worker / poller / web  # What type of server should this process run on
    start_command: ./manage.py run_service get_tweets  # Command to start the process
    working_dir: /srv/conversocial  # Working directory for the process
    checks: { memory_checks: … }  # Monitor the memory usage of the process and kill it if it goes out of control
}

The cookbook then installed each process on the appropriate class of server, running under the Bluepill process supervisor.
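To make the shape of these definitions a little more concrete, here is a rough Python sketch of how definitions like the one above could be filtered by server class. It is not the cookbook's actual code: the `ProcessDefinition` class, the `name` field, the `update_analytics` service and the idea that `cb_roles` holds a list are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ProcessDefinition:
    """Hypothetical in-memory form of one process definition."""
    name: str                 # assumed identifier, not shown in the original example
    cb_roles: List[str]       # assumed to be a list of server classes
    start_command: str
    working_dir: str
    checks: Dict = field(default_factory=dict)


def processes_for_role(definitions, role):
    """Return the definitions that should run on servers of the given class."""
    return [d for d in definitions if role in d.cb_roles]


definitions = [
    ProcessDefinition(
        name="get_tweets",
        cb_roles=["poller"],
        start_command="./manage.py run_service get_tweets",
        working_dir="/srv/conversocial",
    ),
    ProcessDefinition(
        name="update_analytics",  # hypothetical worker process
        cb_roles=["worker"],
        start_command="./manage.py run_service update_analytics",
        working_dir="/srv/conversocial",
    ),
]

# A poller server would only be given the poller processes.
poller_names = [p.name for p in processes_for_role(definitions, "poller")]
print(poller_names)
```

The point of the sketch is only that the role field decides where a process lands; the supervisor configuration generated from each definition is left out.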

Bluepill ensures a process is running and restarts it if it stops. It also provides some nice additional features, such as checks on a process’s memory usage.
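The behaviour described here can be modelled as a small decision function. This is a simplified sketch of what a supervisor does on each check, not Bluepill's implementation; the function name, its parameters and the single memory threshold are assumptions for illustration.

```python
from typing import Optional


def supervisor_action(is_running: bool,
                      rss_mb: Optional[float],
                      memory_limit_mb: float) -> str:
    """Decide what a supervisor should do with a process on one check.

    Returns "start" if the process has stopped, "kill" if it is running
    but using more memory than allowed, and "none" otherwise.
    """
    if not is_running:
        return "start"
    if rss_mb is not None and rss_mb > memory_limit_mb:
        # Once killed, the next check sees a stopped process and starts it again.
        return "kill"
    return "none"


print(supervisor_action(is_running=False, rss_mb=None, memory_limit_mb=512))
print(supervisor_action(is_running=True, rss_mb=900.0, memory_limit_mb=512))
print(supervisor_action(is_running=True, rss_mb=100.0, memory_limit_mb=512))
```

A real supervisor runs a check like this in a loop for every process it manages and usually tolerates short spikes before acting; the sketch ignores both.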

Updating the running code was done with a tool we’d written called “Kettle”. It took a class of servers to deploy to and a git commit, then updated the code by SSH’ing into the relevant servers and doing:

  • A git checkout of the commit being deployed
  • A pip install on the requirements file
  • A bluepill restart on the process
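The three steps above could be scripted roughly as follows. This is a sketch rather than Kettle's real code: the function names, the host names, the default requirements file name and the exact `bluepill restart <process>` invocation are assumptions, and the sketch only builds the commands rather than running them.

```python
import shlex


def remote_deploy_script(commit, process,
                         working_dir="/srv/conversocial",
                         requirements_file="requirements.txt"):
    """Build the shell script run on each server for one deploy."""
    steps = [
        f"cd {shlex.quote(working_dir)}",
        f"git checkout {shlex.quote(commit)}",                # check out the commit being deployed
        f"pip install -r {shlex.quote(requirements_file)}",   # install Python dependencies
        f"bluepill restart {shlex.quote(process)}",           # assumed CLI form
    ]
    # "&&" stops at the first failing step, so a failed install never restarts the process.
    return " && ".join(steps)


def ssh_commands(hosts, script):
    """Return one ssh argument list per server in the chosen class."""
    return [["ssh", host, script] for host in hosts]


script = remote_deploy_script("abc123", "get_tweets")
for command in ssh_commands(["poller1", "poller2"], script):  # hypothetical hosts
    print(command)
```

Each argument list could be handed to `subprocess.run` to perform the deploy. Building the commands separately from running them keeps the logic easy to inspect and test.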

There was also a “static” deploy option which updated the JavaScript and CSS files which powered the frontend.

Selecting the static deploy option would run our frontend build process, which took around 5–10 minutes, and then scp the resulting files onto the web servers.

The deployment flow looked like:

  • Open a PR against a “dev” branch and get it approved by another developer
  • Ensure the tests for your PR pass
  • Announce you want to deploy in a Slack room and negotiate with anyone else already deploying for a slot
  • Merge your PR
  • Wait for Jenkins to test the resulting merge (we referred to this internally as the “dev” build)
  • If it passes, merge into the dev branch; if it fails, either retry the build if you suspect it was a flake, or fix the issue, either by reverting first to let other people deploy their changes or by blocking anyone else from merging whilst you fix it
  • Merge the dev branch into the master branch
  • Use Kettle to deploy your code to the appropriate set of servers

Figure: A completed rollout in Kettle

Supporting all of this was a single-node Jenkins setup that ran the tests for open pull requests and merges into a development branch prior to a deploy. The jobs for testing PRs and the merge into dev were different but we never saw any major differences in the results.

That’s where things stood before we embarked on this project. Next time we’ll look at what we wanted to improve and how we started to do it.

Follow me on Twitter @AaronKalair

Part 2 — https://medium.com/@AaronKalair/all-aboard-the-deploy-train-stop-2-more-jenkins-for-everyone-8b585e4239f9

Part 3 — https://medium.com/@AaronKalair/all-aboard-the-deploy-train-stop-3-peas-anyone-772d46a8b7ed

Part 4 — https://medium.com/@AaronKalair/all-aboard-the-deploy-train-stop-4-iterating-on-the-ui-b26e1962083f

Part 5 — https://medium.com/@AaronKalair/all-aboard-the-deploy-train-stop-5-arriving-at-the-destination-18e7dd019a03
