Functional Abstractions for CI

Kynan Rilee
Published in Koki
Jan 14, 2018

Continuous Integration (CI) is the practice of automatically running builds and tests to ensure the integrity of a codebase. What does this mean? Whenever someone proposes a change, CI quickly checks whether the codebase still builds and passes its tests with that change applied.

  • CI runs for individual changes, so it’s much easier to identify the cause of a breakage and fix it.
  • CI runs automatically. Once it’s set up, you don’t have to think about it to use it.

CI improves development velocity by giving developers the information they need to fix bugs early — without adding extra steps to their development workflow.

The benefits of CI are clear. How do you set it up? Typically, it looks like a small program that runs for each proposed change to a codebase. This program compiles the codebase into artifacts to test and publish. It’s often a simple pipeline, but not always. Today’s variety of CI languages reflects a broad range of complexity.

Figure: Pipelines can get complicated.

The key to a good CI language is creating abstractions that allow a CI pipeline to evolve over time. Let’s begin with a simple pipeline and explore how to add functionality without decreasing its maintainability.

The Simplest Pipeline

Suppose the codebase is a Go library. Before allowing changes to be committed, we want to make sure the codebase will still compile successfully and pass all tests:

  • Step 1: Build the library.
  • Step 2: Run the tests.

These two steps form a simple program that runs each time someone proposes a change to the codebase.

The First Abstraction

As our needs grow, the program can become more complex. For example, if we want to build and test using different compiler versions:

  • Step 1: Build the library with Go 1.8.
  • Step 2: Run the tests with Go 1.8.
  • Step 3: Build the library with Go 1.9.
  • Step 4: Run the tests with Go 1.9.

We added one simple concern, and the program became twice as complex. Do this a few more times, and the program isn’t manageable anymore. Suppose instead that we parameterize the original program by Go version and run it twice, once for each version:

  • Step 1: Run the pipeline with Go 1.8.
  • Step 2: Run the pipeline with Go 1.9.

Instead of four steps, there are two. We’ve accomplished this by hiding some complexity from ourselves — by putting a multi-step process behind the name “Run the pipeline with X”. This is called abstraction.
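One way to picture “Run the pipeline with X” is as a function that takes the Go version and returns a single action. The sketch below is only an illustration: `pipeline_with` and `run_all` are hypothetical names, and each action returns descriptions of the steps it would perform instead of running them.

```python
from typing import Callable, List

# Assumption of this sketch: an action returns the list of steps it performed.
Action = Callable[[], List[str]]

def pipeline_with(go_version: str) -> Action:
    """Return one action that hides the build-and-test steps for a Go version."""
    def action() -> List[str]:
        return [
            f"build with Go {go_version}",
            f"test with Go {go_version}",
        ]
    return action

def run_all(actions: List[Action]) -> List[str]:
    """Run the actions one after another and collect what they did."""
    performed: List[str] = []
    for action in actions:
        performed.extend(action())
    return performed

# Two top-level steps, still four underlying steps.
steps = run_all([pipeline_with("1.8"), pipeline_with("1.9")])
```

The caller lists two actions; the four underlying steps are still there, but they live behind the name `pipeline_with`.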

Benefits of Abstraction

Because abstraction hides information, it can be used to document which details are important and which ones are not. In the example above, the abstraction says the number of steps in the pipeline doesn’t matter. What matters is that we’re running two identical pipelines, one for each Go version.

The strongest abstractions hide information so thoroughly that you can get away with completely ignoring what’s hidden. An example is “pure” functions, whose results depend only on their arguments and which have no observable effects aside from their return values. If you know the return value, you never need to know how it was produced.
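To make the distinction concrete, here is a small sketch contrasting a pure function with an impure one. Both function names and the `history` list are made up for this example.

```python
from typing import List

history: List[str] = []

def describe_build(go_version: str) -> str:
    """Pure: the result depends only on the argument; nothing else changes."""
    return f"build with Go {go_version}"

def describe_build_recorded(go_version: str) -> str:
    """Impure: same return value, but it also mutates shared state."""
    history.append(go_version)
    return f"build with Go {go_version}"

first = describe_build("1.9")
second = describe_build("1.9")
unchanged_after_pure_calls = list(history)  # snapshot: still empty

describe_build_recorded("1.9")
describe_build_recorded("1.9")
```

Callers of `describe_build` can ignore its body entirely. Callers of `describe_build_recorded` cannot, because calling it twice leaves `history` in a different state than calling it once.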

Another benefit of abstraction is that hiding details makes different things look the same. This makes it possible to use generic code. Imagine a generic mechanism for running multiple actions in parallel. We’ve already hidden the details of a build-and-test pipeline so it looks like a single “action”. We can get parallel execution for free:

  • Run these actions in parallel: [Run the pipeline with Go 1.8, Run the pipeline with Go 1.9]

We wouldn’t be able to use this mechanism if we couldn’t talk about each sub-pipeline as a single piece. “Run steps 1 and 2 in parallel” makes sense as a generic function. But against the original four-step version, there’s no built-in function for “Run steps 1 and 3 in parallel and steps 2 and 4 in parallel, and make sure step 2 starts when step 1 finishes and step 4 starts when step 3 finishes.”
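A generic “run in parallel” mechanism could look something like the sketch below, which uses threads from Python’s standard library. The helpers `in_parallel`, `in_sequence`, `step`, and `pipeline_with` are hypothetical, and each step only records its name instead of doing real work.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

# Assumption of this sketch: an action returns True on success.
Action = Callable[[], bool]

log: List[str] = []

def step(name: str) -> Action:
    """A stand-in step that records its name and succeeds."""
    def action() -> bool:
        log.append(name)
        return True
    return action

def in_sequence(actions: List[Action]) -> Action:
    """Combine actions into one action that runs them in order."""
    def combined() -> bool:
        return all(a() for a in actions)  # stops at the first failure
    return combined

def in_parallel(actions: List[Action]) -> Action:
    """Combine actions into one action that runs them concurrently."""
    def combined() -> bool:
        if not actions:
            return True
        with ThreadPoolExecutor(max_workers=len(actions)) as pool:
            futures = [pool.submit(a) for a in actions]
            return all([f.result() for f in futures])  # wait for every action
    return combined

def pipeline_with(go_version: str) -> Action:
    return in_sequence([
        step(f"build with Go {go_version}"),
        step(f"test with Go {go_version}"),
    ])

ok = in_parallel([pipeline_with("1.8"), pipeline_with("1.9")])()
```

`in_parallel` knows nothing about builds, tests, or Go versions. It only needs each sub-pipeline to look like a single action, and the ordering within each sub-pipeline is handled by `in_sequence`.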

If “running actions in parallel” is itself an action, then we can “run in parallel” in parallel! Let’s parallelize part of each sub-pipeline:

  • Sub-pipeline Step 1: Build the library with Go 1.x
  • Sub-pipeline Step 2: Run these actions in parallel: [Run half the tests, Run the other half of the tests]

Now we’re running parallel pipelines for Go 1.8 and 1.9 and parallelizing the tests within each pipeline. It’s two different levels of parallelism made possible by a single generic mechanism.
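The nesting can be sketched as follows, again with hypothetical helper names and with steps that only record what they would do. Because the result of `in_parallel` is itself an action, it can appear inside a sequence that is in turn run in parallel.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

# Assumption of this sketch: an action returns True on success.
Action = Callable[[], bool]

log: List[str] = []

def step(name: str) -> Action:
    """A stand-in step that records its name and succeeds."""
    def action() -> bool:
        log.append(name)
        return True
    return action

def in_sequence(actions: List[Action]) -> Action:
    def combined() -> bool:
        return all(a() for a in actions)  # stops at the first failure
    return combined

def in_parallel(actions: List[Action]) -> Action:
    def combined() -> bool:
        if not actions:
            return True
        with ThreadPoolExecutor(max_workers=len(actions)) as pool:
            futures = [pool.submit(a) for a in actions]
            return all([f.result() for f in futures])
    return combined

def pipeline_with(go_version: str) -> Action:
    """Build first, then run the two halves of the tests concurrently."""
    return in_sequence([
        step(f"build with Go {go_version}"),
        in_parallel([
            step(f"first half of tests with Go {go_version}"),
            step(f"second half of tests with Go {go_version}"),
        ]),
    ])

# Outer level: one sub-pipeline per Go version, also run concurrently.
ok = in_parallel([pipeline_with("1.8"), pipeline_with("1.9")])()
```

The same `in_parallel` function appears at both levels; nothing new had to be added to get the second level of parallelism.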

Conclusion

A simple toolbox of good abstractions is often more usable than a similarly powerful collection of specialized language features. With the right presentation, this simplicity can be easier to learn as well. CI in particular has a lot to gain from adding expressiveness without increasing complexity.
