Continuous Integration on CodeChain

SeongChan Lee
Published in CodeChain
12 min read · Jun 11, 2019
https://commons.wikimedia.org/wiki/File:BMW_Leipzig_MEDIA_050719_Download_Karosseriebau_max.jpg

Everything changes during the development process. Different requirements may emerge, or the development environment may change, and the code must be adjusted accordingly. Because code is modified so regularly, it takes steady management and investment to keep it from falling apart. Without diligent care, the code may lose formatting consistency, and failing test cases may be left unattended. In the worst case, compile errors accumulate over long periods behind rarely used build flags and eventually surface as build failures. If the codebase keeps breaking in this manner, it hurts the productivity of the development team and, ultimately, the quality of the entire product. For a blockchain project, where reliability is paramount, it is crucial to maintain proper coding practices.

It is important to keep your code in good shape by enforcing practices such as strengthening code reviews, standardizing code formatting, running test cases on a regular basis, and building the code with various build flag combinations. However, this repetitive and constant work cannot simply be left as the responsibility of each individual developer. Putting it in human hands not only leads to errors, but also increases developer fatigue, lowers efficiency, and gradually turns the codebase into a big mess.

Therefore, it is important to automate the tools and workflows related to code maintenance. The CodeChain Core project takes advantage of a variety of software and services that keep the code in the best condition. When a pull request is created in the Github repository, Travis CI automatically runs various checks and reports the results, informing the developer of flaws in the code. Mergify works with CI and the review system to automate tasks associated with Github’s pull requests, such as branch updates and merges, which are repetitive, cumbersome and often delayed. This reduces the team’s overall workload.

Automation of the Merge Process Using Mergify

The CodeChain Core project uses strict status checks on pull requests merged into the master branch, and uses rebase merge as the default merge method. With strict status checks, a pull request must be up-to-date with the branch it targets before it can be merged. This is because even if the rebase or merge commit is created without conflicts, the resulting code might still be incorrect. For example, imagine merging branches A and B sequentially into the master branch. Branch A adds code that uses the function foo, and branch B removes foo from the code. Merging A succeeds, and merging B succeeds as well, but after B is merged, a compilation error occurs. Therefore, whenever a rebase or merge commit is created, you should run the tests again and apply the result to the master branch only when there are no problems.
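
The foo scenario can be reproduced in miniature. The sketch below is only an illustration, not the project's code: each "branch" is a hypothetical one-line Bash script held in a variable, and "building" it means running it in a fresh shell.

```bash
#!/usr/bin/env bash
# Hypothetical stand-ins for the versions of the code base.
# master:   defines foo, nothing calls it yet.
# branch A: master plus a new call to foo.
# branch B: master with foo removed.
master='foo() { echo "foo called"; }'
branch_a="${master}"$'\n''foo'
branch_b=':'   # foo deleted; ":" is a no-op so the script is still valid
# Applying both changes keeps the call from A but drops the definition.
merged_a_then_b='foo'

# "Build" a version by running it in a fresh shell and report the outcome.
check() {
  if bash -c "$1" >/dev/null 2>&1; then echo pass; else echo fail; fi
}

echo "branch A alone:  $(check "$branch_a")"
echo "branch B alone:  $(check "$branch_b")"
echo "A then B merged: $(check "$merged_a_then_b")"
```

Each branch passes on its own, while the combination fails because the call to foo survives and its definition does not. This is the kind of breakage that only a test run on the updated branch can catch.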

In the CodeChain Core project, which uses this approach, in order for pull requests to be merged into the master branch, the following workflow has to be followed:

  1. The contributor creates a pull request.
  2. Wait for Travis CI’s test results.
  3. Anyone specified as a code reviewer must review the code.
  4. When all reviews are finished and Travis CI doesn’t report any errors, the merge process will begin.
  5. If the branch of a pull request is fast-forwardable to the master, or if the merge commit is up-to-date with the branch’s latest commit, it will attempt rebase merging.
  6. If it is not up-to-date, the branch is updated to master by using rebase or Github’s Update Branch function and returns to step 2.

However, there is a drawback to this workflow. Only one pull request that has been updated to the master can be merged at a time, and once it has been merged, the remaining pull requests must be updated again before they can be merged. If you mistakenly update other pull requests, or merge them first instead of the pull request that was already updated, the recently updated pull requests have to be updated yet again. Few mistakes are made when few pull requests are open. However, as the number of open pull requests increases and the tests on Travis CI start to lag, mistakes become more common. Frequent branch updates increase the number of jobs queued in Travis CI, and because the number of concurrent jobs that Travis CI can run is limited, the work on other pull requests is gradually delayed as well. To avoid this situation, whoever merges the pull requests must carefully remember which pull requests have been updated and merge them in order.

To reduce the burden on the person merging, lighten the load on Travis CI, shorten merge delays, and consequently improve the team’s efficiency, the CodeChain engine team introduced Mergify. Mergify is a merge management service that works with Github. You set a policy in the .mergify.yml file in the repository, and Mergify proceeds with or holds back the automatic merging of each pull request according to that policy. Various other actions can be performed automatically as well.

Here is the .mergify.yml rule that we enforce on our team: (link)

pull_request_rules:
  - name: Merge when CI passes and resolves all requested reviews
    conditions:
      - "#approved-reviews-by>=1"
      - "#review-requested=0"
      - "#changes-requested-reviews-by=0"
      - status-success=continuous-integration/travis-ci/pr
      - base=master
      - label!=do-not-merge
      # The leading "-" negates the condition: the title must not contain WIP.
      - "-title~=\\b(wip|WIP)\\b"
    actions:
      merge:
        method: rebase
        rebase_fallback: null
        strict: smart

The workflow of Mergify

This rule allows Mergify to automatically merge pull requests that pass the following conditions:

  1. Has at least one approving review, with no pending review requests and no reviewer requesting changes.
  2. Passes Travis CI.
  3. Does not have ‘WIP’ in the title.
  4. Does not have the ‘do-not-merge’ label.
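
The conditions of the rule can be read as a chain of checks that must all hold. The following sketch mirrors them in a Bash function. It is not how Mergify evaluates rules internally: the function name, its argument order, and the sample values are hypothetical, and the whole-word match with grep -w only approximates the \b regular expression in the rule.

```bash
#!/usr/bin/env bash
# Returns 0 (mergeable) when a pull request satisfies every condition.
# Hypothetical arguments, in this order:
#   approvals pending_requests change_requests ci_status base labels title
is_mergeable() {
  local approvals=$1 pending=$2 changes=$3 ci=$4 base=$5 labels=$6 title=$7
  [ "$approvals" -ge 1 ] || return 1     # "#approved-reviews-by>=1"
  [ "$pending" -eq 0 ] || return 1       # "#review-requested=0"
  [ "$changes" -eq 0 ] || return 1       # "#changes-requested-reviews-by=0"
  [ "$ci" = "success" ] || return 1      # status-success=.../travis-ci/pr
  [ "$base" = "master" ] || return 1     # base=master
  case " $labels " in                    # label!=do-not-merge
    *" do-not-merge "*) return 1 ;;
  esac
  # The title must not contain "wip" or "WIP" as a whole word.
  if printf '%s\n' "$title" | grep -qwE 'wip|WIP'; then return 1; fi
  return 0
}

is_mergeable 1 0 0 success master "enhancement" "Add block verification" && echo "merge"
is_mergeable 1 0 0 success master "enhancement" "[WIP] Add block verification" || echo "hold"
```

The first call prints merge and the second prints hold, since the only difference between them is the WIP marker in the title.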

We also set the merge method to rebase only, with no fallback to another method, and turned the strict merge setting on.

If you use the strict merge setting, Mergify automatically updates a pull request that satisfies the conditions to the latest master branch before merging it. However, as explained above, it is important to keep the update order and the merge order consistent. If Mergify updated and tried to merge pull requests in arbitrary order, Travis CI would still be flooded with jobs and the delays would remain. This is why we use the strict: smart option. With strict: smart, Mergify manages a work queue so that only one pull request is updated and merged at a time.

Code Inspection and Test Automation Using Travis CI

Mergify is only responsible for automating the merging of pull requests based on reviews and the results from Travis CI; checking the code for possible problems is up to Travis CI. Travis CI is integrated with Github, so when a pull request is created, build jobs that check it are created automatically. These jobs are defined in the .travis.yml file in the repository’s root directory. In the CodeChain Core project, the jobs cover the following: checking the formatting of the code and catching minor mistakes, verifying that CodeChain builds on the target platforms it is deployed to, and running the unit tests and the end-to-end (E2E) tests. Jobs defined in CodeChain can be found here.

The CodeChain Core project is written in the Rust programming language. We use the code formatter rustfmt to check the formatting of the Rust code, and the lint tool rust-clippy to check for minor mistakes. Unit tests are run with cargo test, which naturally compiles the code first and therefore detects compile errors too. The E2E tests are scripts written in TypeScript that test the code written in Rust, and this test code is subject to the same kind of management. We use a formatter called ‘prettier’ to check its formatting, and a lint tool called ‘tslint’ to check for minor mistakes. The E2E tests are written and executed with the mocha test framework.
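
As a rough sketch, the Rust checks described above could be chained in a script like the one below. The exact flags used by the project are not given in this article, so the flags here (for example -D warnings, which turns lint warnings into errors) are assumptions, as are the helper names run and rust_checks. With DRY_RUN=1 the commands are printed instead of executed.

```bash
#!/usr/bin/env bash
# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then echo "+ $*"; else "$@"; fi
}

rust_checks() {
  # Formatting: fails if rustfmt would change any file.
  run cargo fmt -- --check || return 1
  # Lints: assumed here to treat every warning as an error.
  run cargo clippy -- -D warnings || return 1
  # Unit tests: compiles the code first, so compile errors surface too.
  run cargo test || return 1
}

# Show what would be executed without needing a Rust toolchain.
DRY_RUN=1 rust_checks
```

Stopping at the first failing step keeps the cheaper checks (formatting, lints) from being followed by an expensive test run that would be discarded anyway.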

Reducing Execution Time for Unnecessary Tasks in Travis CI

Travis CI

Since the CodeChain project is not a small one, it takes a long time to complete the tests. The CodeChain binaries written in Rust alone take tens of minutes to build, and the E2E tests take up to 30 minutes because they run a variety of scenarios that wait for consensus among binaries built identically to the actual operating environment. Doing all this work in a row takes about an hour and 20 minutes. For each pull request, these operations are performed three times: once on the pull request branch when it is first created, once again after the branch has been updated to master, and one last time on the master branch after the merge. The whole process takes up to 4 hours. The free plan Travis CI offers to open source projects limits the number of concurrent jobs to four, so the total CI time available in an 8 hour work day is only 8 * 4 = 32 hours. In other words, at most eight pull requests can be processed per day. Code changes prompted by peer reviews and jobs created by work-in-progress pull requests reduce this number considerably, and pull request checks were gradually backing up. We needed to reduce the time these jobs take.
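
The budget above can be checked with a few lines of shell arithmetic. The numbers are the ones quoted in the paragraph; only the variable names are made up for the example.

```bash
#!/usr/bin/env bash
minutes_per_run=80     # one full build and test pass: about 1 h 20 min
runs_per_pr=3          # pull request branch, updated branch, master
concurrent_jobs=4      # limit of the free plan
workday_hours=8

minutes_per_pr=$(( minutes_per_run * runs_per_pr ))
budget_minutes=$(( workday_hours * concurrent_jobs * 60 ))
prs_per_day=$(( budget_minutes / minutes_per_pr ))

echo "CI time per pull request: $(( minutes_per_pr / 60 )) hours"
echo "CI budget per work day:   $(( budget_minutes / 60 )) hours"
echo "Pull requests per day:    ${prs_per_day}"
```

This prints 4 hours, 32 hours, and 8 pull requests, which matches the figures in the text and makes it easy to see how much each skipped run gives back.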

We optimized the process so that unnecessary work in Travis CI finishes as early as possible. Travis CI provides a way to define jobs conditionally, but because the information available to it is limited and its DSL offers little freedom, we used a more flexible shell script to finish jobs early. The cases in which we finish a job early using a shell script are:

  1. If the pull request contains only changes to documents that do not affect the build, tests are not performed.
  2. If only E2E tests are edited, no unit tests are performed.
  3. Tests on the master branch that would run immediately after Mergify automatically merges a pull request are skipped. Thanks to the strict merge setting and the rebase merge policy, the pull request code that Mergify updated and tested is identical to the master branch code after the rebase merge.
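
The first case in the list can be sketched as a small classifier over the list of changed files. This is an illustration rather than the project's script: the function name and the patterns that count as documentation (*.md and docs/*) are assumptions.

```bash
#!/usr/bin/env bash
# Reads changed file paths from stdin, one per line, and prints "skip"
# when every path is documentation, "noskip" otherwise.
# The documentation patterns below are assumptions for illustration.
classify_changes() {
  local path seen=0
  while IFS= read -r path; do
    [ -n "$path" ] || continue
    seen=1
    case "$path" in
      *.md|docs/*) ;;                # documentation: keep looking
      *) echo noskip; return 0 ;;    # anything else needs the full test run
    esac
  done
  # An empty change list is treated as "noskip" to stay on the safe side.
  if [ "$seen" -eq 1 ]; then echo skip; else echo noskip; fi
}

printf '%s\n' README.md docs/spec.md | classify_changes      # prints skip
printf '%s\n' README.md core/src/lib.rs | classify_changes   # prints noskip
```

On Travis CI the file list could come from something like git diff --name-only "$TRAVIS_COMMIT_RANGE", using the commit range that Travis CI exposes as an environment variable. Reading the list from stdin keeps the decision logic testable without a repository.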

It is easy to write a shell script that checks for these conditions and immediately aborts the job. However, the particular environment of Travis CI required additional considerations to make the script reusable. For Mergify to merge automatically, Travis CI must report “successful” even when a job finishes early. Because no code is bug-free, the shell script must also stop the job if it encounters an unexpected error. Finally, the shell script should log which commands were executed so that they can be verified.

It was relatively simple to log the shell script and abort it immediately on failure. The bash shell has several flags that can be turned on or off with the set command, and turning on the e flag and the v (or x) flag solves both problems. If you turn on the e flag with set -e and a command in the shell script returns a nonzero exit code, the shell stops immediately and the exit code of the failed command becomes the exit code of the shell. If a command listed in .travis.yml terminates with a nonzero exit code, Travis CI aborts the job and reports “Failed.” Turning on the v flag with set -v prints the lines of the shell script as they are read. If you turn on the x flag instead, each command is printed as it is executed, with variables expanded as they were passed as command arguments.
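
The effect of the two flags is easy to observe locally. The sketch below runs a three-command script in a child shell with -e and -x, and collects the exit status, the standard output, and the trace separately. The sample script and variable names are made up for the demonstration.

```bash
#!/usr/bin/env bash
# A tiny script whose second command fails.
script='echo "step 1"; false; echo "step 2 (never reached)"'

trace_file=$(mktemp)
status=0
# Run it in a child shell with -e and -x so the parent shell survives.
output=$(bash -ex -c "$script" 2>"$trace_file") || status=$?
trace=$(cat "$trace_file")
rm -f "$trace_file"

echo "exit status: $status"   # status of the failed command
echo "stdout:      $output"   # only the output produced before the failure
echo "trace:"
echo "$trace"                 # executed commands, each prefixed with "+"
```

The child shell stops at false, so step 2 never runs, and the trace written to standard error lists exactly the commands that were executed.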

However, it was not possible to stop a Travis CI job from within a shell script while leaving it in the “successful” state. Commands such as exit 0 and travis_terminate 0 stop the shell itself when called from within a shell script, but they do not stop the Travis CI job. The script had to return a value meaning either “stop” or “continue” to the outside. There are usually two ways to return a value from a subshell: one is the exit code, and the other is to print to standard output and capture that output in the parent shell. However, the shell’s exit code was already being used to determine whether the Travis CI job had “failed” or not, and standard output was already being used by ordinary commands in the shell script for logging.

Fortunately, Travis CI logs one more stream besides standard output: standard error. On a Unix-like operating system, a running process gets three standard streams for interacting with the outside world: standard input on file descriptor 0, standard output on file descriptor 1, and standard error on file descriptor 2. In an interactive shell, standard input receives input from the keyboard, and standard output writes text to the terminal screen. Standard error carries error messages in a stream that can be distinguished from standard output. Travis CI also logs the standard error stream.
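
That the two output streams really are separate can be seen by capturing them one at a time. In this small sketch the function emit and the variable names are hypothetical.

```bash
#!/usr/bin/env bash
emit() {
  echo "result"           # fd 1, standard output
  echo "diagnostic" >&2   # fd 2, standard error
}

# Command substitution captures fd 1 only; here fd 2 is discarded.
only_stdout=$(emit 2>/dev/null)
# Point fd 2 at the capture first, then discard fd 1.
only_stderr=$(emit 2>&1 >/dev/null)

echo "stdout carried: $only_stdout"
echo "stderr carried: $only_stderr"
```

Redirections are applied from left to right, which is why 2>&1 >/dev/null captures standard error while discarding standard output.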

Our scripts use this to return values from the subshell while still logging the command execution history in Travis CI. The Bash shell has powerful I/O (input and output) redirection capabilities. You can redirect the I/O of individual commands running within the shell, but you can also redirect I/O for the shell as a whole. We made fd 3 inside the shell point to the original standard output (fd 1) and redirected everything written to standard output (fd 1) inside the shell to the outside standard error (fd 2), using the following Bash command:

exec 3>&1 1>&2

The parent shell can capture the standard output of a subshell and save it to a variable, as in RESULT=$(./script). Because fd 3 is redirected to that standard output, it can be used as a stream for passing return values, for example echo skip >&3 or echo noskip >&3. In .travis.yml, we run a shell script that decides whether the Travis CI job should stop early and, depending on the return value, terminate the job with exit status 0 from .travis.yml itself. This stops the Travis CI job early while leaving it in the “Success” state.

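./.travis/check-foo

#!/usr/bin/env bash
# Abort on the first failing command (-e) and trace every command (-x).
set -ex
# fd 3 now points at the caller's stdout; ordinary stdout goes to stderr.
exec 3>&1 1>&2
# Pass a value back to the caller through fd 3.
function return_to_travis { echo "$1" >&3; }
if some_condition_foo; then
    echo "Skipped!"
    return_to_travis "skip"
else
    echo "Don't Skip!"
    return_to_travis "noskip"
fi

.travis.yml

RESULT=$(./.travis/check-foo); if [ "$RESULT" = "skip" ]; then travis_terminate 0; fi;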

Putting these together, the shell script and the Travis CI script look as shown above. Because all standard output of ordinary commands is redirected to the standard error stream in one place, you do not have to redirect individual commands and can write the script as you normally would. The script leaves trace logs in Travis CI, stops when a command fails unexpectedly thanks to set -e, and can finish a job early with a “success” status when desired.
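
The pattern can be tried locally without Travis CI. The sketch below replaces the separate script file with a function whose body runs in a subshell, and replaces some_condition_foo with a simple string comparison. Both substitutions, and the names used, are assumptions made to keep the demonstration in one file.

```bash
#!/usr/bin/env bash
# Stand-in for ./.travis/check-foo. The parentheses run the body in a
# subshell, so the exec redirection does not leak into the caller.
check_foo() (
  set -e
  exec 3>&1 1>&2                  # fd 3 = caller's stdout, fd 1 = stderr
  echo "inspecting change: $1"    # ordinary log line, ends up on stderr
  if [ "$1" = "docs-only" ]; then
    echo skip >&3                 # return value, captured by $(...)
  else
    echo noskip >&3
  fi
)

result=$(check_foo docs-only)
echo "docs-only   -> $result"
result=$(check_foo code-change)
echo "code-change -> $result"
```

The log lines still reach the terminal through standard error, while the variable receives only the word written to fd 3. The caller can then decide whether to end the job successfully.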

In this article, we introduced how we use CI to keep the CodeChain Core project healthy and to improve the team’s productivity. Mergify automates the pull request workflow on Github and reduces the effort required of developers, and the various inspection tools running on Travis CI detect anomalies in the code at the pull request level, preventing errors from flowing into the code base in the first place. In addition, Travis CI jobs are configured to perform verification only when necessary, making full use of our limited resources. Our CI environment did not look like this from the beginning. By developing with Github as the focal point and resolving productivity issues step by step along the way, we evolved it into its current state. Our projects will continue, and our code and environment will continue to change. Accordingly, our CI environment will keep evolving to improve productivity.
