One of the metrics defined in the DORA State of DevOps report is Lead Time for changes. This is the time it takes for a change to be developed and deployed into production.
With packaged software there is additional lead time in terms of creating media (disks etc.) and packaging. However, with cloud-based software there are no physical constraints on deploying changes (although there may still be regulatory constraints or other reasons why a change may be delayed).
Defining “Lead time for changes”
If we want to improve our Lead time for changes metric, we first need to measure this time. As software engineers our changes to the system are captured in source control prior to releasing. The THG warehouse management systems team uses a git-branch based workflow where each branch signifies a single (hopefully desirable) change to the software.
True lead time should be the total time from inception to release — as we don’t capture the inception of a change in a consistent fashion, the team decided on the start of a change to be defined as:
The datetime of the first commit to a branch that is ultimately associated with a Jira Issue Number & Pull Request in GitHub
This eliminates prototype branches, dead end changes and other anomalies that can occur during normal day-to-day development.
A second metric that the team wanted to measure was the time a branch was in flight before being merged into a release version. This is important because there are occasions where there is no technical reason to delay a change, but a decision is made not to release it. Tracking the time from branch creation to merge into a release allows us to plot branch time vs branch delivery time as independent values. These two measurements would allow us to see any improvements to our two key software delivery metrics.
Calculating branch lifetime and delivery time
Using a local git clone of the repository we initially need to work out which commits exist between one tag or release and the next tag. To begin with we need to determine a common ancestor or merge-base for the start tag and the end tag. We achieve this on the command line using:
git merge-base <v1> <v2>
Then we need to inspect the git log to get all commits from this common ancestor to the commit that is referenced by the end tag — once again on the command line:
git log <v1>..<v2>
Finally for each of the commits listed in this log output we need to calculate the branch lifetime for which we need the merge time for each branch and the first commit time for the branch.
We can get the merge time (the time that the feature branch was merged back into the release) by manipulating the log output:
git log --pretty=format:%h,%at,%s <v1>..<v2>
To get the first commit time to the branch (our proxy measure for when ‘work was started’) we need to call the Github API to fetch this information. When we have both times, a simple subtraction will give us the lifetime of a branch.
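Putting those two numbers together is simple arithmetic. As an illustrative sketch (the function names here are mine, not the article's actual code):

```rust
// Illustrative helpers, not the article's actual code.
// Pull the merge timestamp (the %at field) out of one line of
// `git log --pretty=format:%h,%at,%s` output.
fn merge_epoch_from_log_line(line: &str) -> Option<i64> {
    // fields: abbreviated hash, author time (seconds since epoch), subject
    line.splitn(3, ',').nth(1)?.parse().ok()
}

// Branch lifetime: merge time minus the branch's first-commit time,
// both expressed as seconds since the Unix epoch.
fn branch_lifetime_secs(first_commit_epoch: i64, merge_epoch: i64) -> i64 {
    merge_epoch - first_commit_epoch
}
```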
Now that we have determined what the input data is and how to get the data from the command line, all that’s left is to automate the process so that we start generating these statistics for each release that is deployed to production.
This was previously scripted in Python, where interactions with git used `subprocess` to shell out and call the exact commands needed. However, I decided that this initial prototype would be better reimplemented in Rust.
Collection of commits
The first thing to accomplish (beyond initial project setup with a `Cargo.toml` file) is to generate a list of commits that define the set of changes from one tagged release to the next.
To interact with the git repository I'm using the libgit2 bindings for Rust. There is no explicit 'log' type; instead, to get a facsimile of the behaviour from the git command line, I use a `Revwalk`.

I pass in a reference to the `from` tag, the `to` tag and the `Repository`. With Rust it is important to distinguish the ownership of a variable, with `from: &str` being a reference to the thing, not the thing itself. This is not the same as a pointer in C despite the similar surface syntax. In Rust, references "allow you to refer to some value without taking ownership of it".

On the `revwalk` I set the `to` commit and then use `merge_base` to find the common ancestor. Finally I call `.map` to process each of the objects between the common ancestor and the end commit and convert them into `git2::Commit` types. After getting the actual `Commit` I hand it off for further processing to convert into a string for output. There is some adjustment prior to executing the 'walk': I need to hide the initial `from` commit or the log will contain an additional commit; this is just a small idiosyncrasy of the libgit2 bindings at play.
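Since the original snippet isn't reproduced in this copy, the following is a sketch of the walk just described, assuming the git2 crate (details such as the exact error handling may differ from the article's code; the branch refs are assumed to resolve directly to commits):

```rust
use git2::{Commit, Error, Repository};

// Sketch of the log walk: start from the merge-base of `from` and `to`,
// walk up to `to`, and hide the `from` commit itself so it doesn't
// appear as an extra log entry.
fn get_commit_log<'repo>(
    repo: &'repo Repository,
    from: &str,
    to: &str,
) -> Result<Vec<Commit<'repo>>, Error> {
    let f = repo.revparse_single(from)?;
    let t = repo.revparse_single(to)?;
    let base = repo.merge_base(f.id(), t.id())?;

    let mut revwalk = repo.revwalk()?;
    revwalk.push(t.id())?;
    revwalk.hide(base)?;
    revwalk.hide(f.id())?;

    // Map each walked Oid to its full Commit object; collecting
    // Result items gives us the first error, if any.
    revwalk
        .filter_map(|oid| oid.ok())
        .map(|oid| repo.find_commit(oid))
        .collect()
}
```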
Aside: Failing gracefully
The code starts by converting the tag names into `git2::Object` types. The conversion returns a `Result<Object, Error>` which we need to handle. To handle the possibility of failure, Rust provides several options:
- Don't care about the error case and just `unwrap`, or supply a fallback via `unwrap_or` and its derivatives
- Use `match` to handle the success and error cases explicitly
- Propagate the error to the caller with the `?` operator
The simplest method is to propagate the error out of the function. That is, if your function definition also returns a `Result<T, Error>` then you can simply use `?` on each call that returns a `Result<T, Error>`, and if a call fails your function will fail.
let f = repo.revparse_single(from)?;
This pattern is repeated throughout this code, with the calling code then needing to either also return `Result<T, Error>`, or use `match` and handle both the happy and unhappy paths.
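As a standalone illustration of this propagate-then-match pattern (not the article's code; it uses standard-library parsing purely as an example of a fallible call):

```rust
use std::num::ParseIntError;

// Illustrative only: `?` returns early with the Err value if parsing
// fails, otherwise unwraps the Ok value and execution continues.
fn sum_of_inputs(a: &str, b: &str) -> Result<i64, ParseIntError> {
    let x: i64 = a.parse()?;
    let y: i64 = b.parse()?;
    Ok(x + y)
}

// The caller must then handle (or further propagate) the Result,
// e.g. with a `match` over the happy and unhappy paths.
fn describe(a: &str, b: &str) -> String {
    match sum_of_inputs(a, b) {
        Ok(total) => format!("total: {}", total),
        Err(e) => format!("invalid input: {}", e),
    }
}
```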
After gathering the correct collection of `Commit`s, the function `commit_to_formatted_output` is used to generate the output data that is destined for a CSV file:
This function extracts some data from the `Commit` struct and then calls two other functions. The first simply finds a trailing Pull Request reference from the commit message. Using Rust's pattern matching, the reference is used to gather more information from Github to calculate the total branch time before returning the string. In the case where there is no Pull Request associated with the commit, we simply return the string `unknown` for the `pr_number` and the total branch time.
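As a hypothetical sketch of that first step (the helper names are mine, and the assumption is that GitHub-style merge commits end their summary with a "(#123)" reference):

```rust
// Hypothetical helper, not the article's exact code: extract a trailing
// "(#123)"-style pull-request reference from a commit summary.
fn trailing_pr_number(summary: &str) -> Option<u64> {
    let s = summary.trim_end();
    if !s.ends_with(')') {
        return None;
    }
    let open = s.rfind("(#")?;
    s[open + 2..s.len() - 1].parse().ok()
}

// Pattern match on the optional reference: with a PR number we could go
// on to query GitHub; without one we fall back to a placeholder value.
fn pr_field(summary: &str) -> String {
    match trailing_pr_number(summary) {
        Some(n) => n.to_string(),
        None => "unknown".to_string(),
    }
}
```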
To get the correct branch time we call the Github API to get the first commit date & time for the branch associated with this pull request and then subtract that value (as seconds since unix epoch) from the merge timestamp passed into this function.
To handle the date/time parsing we use Chrono, which allows us to parse a string as a `DateTime<Utc>`. The awesome Reqwest library is used to fetch the JSON from Github, and Serde is used to convert the JSON response into structs:
After parsing the commit log, decorating the `Commit` objects with additional info from each Github Pull Request, and converting into the appropriate output format, we just need to tie this all together:
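The glue code isn't included in this copy; a sketch of how the pieces might be tied together with Docopt (the argument names and the trailing steps are illustrative, not the article's exact code):

```rust
use docopt::Docopt;
use serde::Deserialize;

// Sketch only: assumes the docopt and serde crates. docopt-rs maps a
// positional `<repo-path>` to a struct field named `arg_repo_path`.
const USAGE: &str = "
Usage:
  branch-time <repo-path> <github-repo> <from-tag> <to-tag>
";

#[derive(Deserialize)]
struct Args {
    arg_repo_path: String,
    arg_github_repo: String,
    arg_from_tag: String,
    arg_to_tag: String,
}

fn main() {
    let args: Args = Docopt::new(USAGE)
        .and_then(|d| d.deserialize())
        .unwrap_or_else(|e| e.exit());

    // From here the flow is as described above: open the repository,
    // gather the commit log between the two tags, decorate each commit
    // with GitHub PR data and write the formatted rows out as CSV.
    let _ = (&args.arg_repo_path, &args.arg_github_repo,
             &args.arg_from_tag, &args.arg_to_tag);
}
```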
To handle the command line arguments, Rust has multiple high-quality libraries: Docopt, Clap and args. We chose to use Docopt to keep the code similar to the Python prototype implementation which also used (the Python version of) Docopt.
We also need to look up a Github API personal access token. For ease, the code uses an environment variable lookup: the token shouldn't be stored anywhere in source control, and the goal is to run this on a build server into which we can securely inject the correct environment variable value.
Given an environment variable `GITHUB_STATS_TOKEN` set to a personal access token with the appropriate permissions, the command to run the built binary is:
branch-time <path to local git repo> <github repo> origin/release/v1 origin/release/v2
An example of the output CSV file is shown here:
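The embedded sample isn't reproduced in this copy; purely as an illustration of the layout (column names and all values are hypothetical), a row per merged branch might look like:

```
commit,merge_time,message,pr_number,branch_time
ab12cd3,1600204800,Tidy picking flow (#481),481,204800
9f8e7d6,1600204800,Hotfix applied directly,unknown,unknown
```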
From the generated output it is trivial to calculate the delivery time, i.e. the time from branch creation to the branch being merged into a release, given that we know precisely when each release is deployed.
Lead time for delivery of a single change is calculated as: branch_time_for_change + time for release branch to be deployed.
We can also use this data as the foundation for plotting a time series of median or average lead time for delivery (along with max/min). Plotting these is left as an exercise for the reader…
We can also plot a count of branches deployed without a corresponding Pull Request: any of the changes which have `unknown` for a PR or a branch_time (ie. branches that may have bypassed a code review step or branches missing Jira issue numbers).
Rust vs Python
Beyond creating a tool to provide the team with a means to measure and track software delivery performance, I was personally interested in how the Rust version compared to the original Python prototype.
The prototype accomplished the goal of measuring the branch times, so why bother with the rewrite? We needed to install this on a legacy server with an older version of Python and git installed — there’s also the fact that you shouldn’t really deploy prototypes into production…
When we compare the original prototype with the Rust version, there are some interesting characteristics. The first is that the Rust version is a little shorter than the Python version for a couple of reasons:
- Python version contains more debug output and logging — it’s a prototype after all
- Python version posts a message to Slack instead of writing a CSV file — this requires a little more code (although only a few lines more)
So perhaps it’s unfair to compare the code in terms of just “lines of code” (LOC)? True, the Rust version has the advantage of being written second, however the Rust version also has to use structs for type-safety for the results from the Github API (something we can ignore in Python as we can interpret the results as a
dict) and in the Python version the heavy lifting of creating the log is handled via a call out to git directly. On the other hand, the Rust version also includes tests in the total LOC count.
The Rust version is more complex due to working with libgit2 at a lower level than the Python prototype, which called the normal git commands and interpreted the results. The `get_commit_log` function is significantly more work in Rust than in Python. The rest of the code is nearly identical in terms of overall structure, although there are a few differences due to different idioms and practices.
Which of the two solutions a different engineer would consider more complex is, however, the crux of maintainability for the team going forwards. Rust has an inherent complexity with respect to memory management that a Python program doesn't have, so all else being equal, the Python version is preferable.
Deployability — or ease of installation
Python’s popularity is a double-edged sword. Python is used by OS vendors to build system tools, in particular package management tools (eg. yum). This leads to conflicts between the libraries your scripts need and the libraries the OS-native Python installation depends on.
Rust, however, emits a single binary executable as standard; the downside is that you must compile it for the correct target architecture (for most users this would be x86_64).
Other modern languages take the same approach — Go also compiles to a single binary which contains all required libs (resulting in a larger binary but avoids dynamic linking issues).
Of course, the golden hammer of Docker and containers exists for all languages, however the added complexity of running a small tool inside a container vs running the tool directly is probably not worth the effort.
The goal of the exercise was to be able to calculate a metric that could be used by the team to drive improvements around stalls in delivering changes into production. Once we can see how long each change takes to get delivered we can dig deeper and see if long-lived branches and changes share some common features:
- Do they involve cross-cutting concerns?
- Are they poorly specified?
- Did the work get redone?
- What do the comments on the Pull Request review look like?
The initial calculation was prototyped as a Python script which produced the required output but had undesirable interactions with the target deployment environment (ie. Python library conflicts and issues with incorrect versions of git).
A rewrite using Rust has allowed the team to calculate the same values but deploy more easily as a statically-linked binary while maintaining a similar enough structure to the original code to be understandable to non-Rust programmers in the team.
Complete Rust code for calculating these delivery metrics is available here.