Jenkins Pipelines: What I Wish I Knew Starting Out

A couple of months ago, I took on setting up a continuous deployment process for a new application. The other applications at Managed by Q were already being deployed using Jenkins pipelines, a way to combine multiple jobs or steps in a single Jenkins item and visually show progress through the pipeline. Creating a new Jenkins pipeline seemed like the easiest way to get the new app deploying.

Having never written a Jenkins pipeline before, I hoped the existing pipelines would guide me in writing a new one, and they did help. However, I was new to Jenkins, pipeline development, and Groovy, which led to many problems that often weren't easy to diagnose, some of them causing hours of pain before I found a solution.

Here I share the problems I encountered and tidbits of knowledge that, had I known when I began the project, would have saved me many hours and some hair on my head. Perhaps it can save you some time and hair too.

Serialization and the “node” Block

Jenkins pipelines must be built so they can run across multiple executors, even if you only run Jenkins with one executor. This is done using the node block. With it, a job can be split into smaller pieces that can be scheduled to run on any executor on any agent in the cluster. Once a node block completes, Jenkins serializes the remaining variables to pass on to the next node block or to the intervening Groovy code.

In theory, this works seamlessly and is a great feature. In practice, it does, as long as every class used in your script is serializable.
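To make this concrete, here is a minimal sketch of a pipeline split across two node blocks, with ordinary Groovy code in between. The agent labels and the deploy.sh script are made-up names for illustration:

```groovy
node('builder') {
    checkout scm
    sh 'make build'
}

// Any variable still alive here must be serializable, because Jenkins
// may checkpoint the pipeline state between node blocks.
def buildLabel = "build-${env.BUILD_NUMBER}"

node('deployer') {
    sh "./deploy.sh ${buildLabel}"
}
```

If buildLabel held an instance of a non-serializable class instead of a string, the pipeline could fail at the boundary between the two blocks.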

I've been bitten by non-serializable classes multiple times during pipeline script development. I've also had scripts start failing after a Jenkins server upgrade forced the installation of a new Java version, and classes that were once serializable lost the ability to be serialized.

The solution is to find alternative classes that are serializable. Sometimes there is a drop-in replacement, like when the JsonSlurper class changed to return LazyMap instead of HashMap in Groovy 2.3 and a new JsonSlurperClassic class was added to support the original behavior (ref).
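For example, if a script breaks because JsonSlurper now returns a non-serializable LazyMap, swapping in JsonSlurperClassic is usually all it takes. A sketch (the JSON payload here is made up):

```groovy
import groovy.json.JsonSlurperClassic

// JsonSlurperClassic returns plain, serializable maps and lists,
// unlike JsonSlurper, which returns a LazyMap in Groovy 2.3+.
def branch = new JsonSlurperClassic().parseText('{"commit": {"sha": "abc123"}}')
echo "Latest SHA: ${branch.commit.sha}"
```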

In-process Script Approval

The Script Security Plugin provides a way to limit which Java and Groovy classes and methods a non-privileged user or script can use. This is great from a security and control perspective. When developing a new pipeline script, however, it can easily get in your way.

Here is an example stack trace when using the toArray method in the java.util.Collection class and it hasn’t been whitelisted. It shows up at the end of the Console Output for the pipeline.

org.jenkinsci.plugins.scriptsecurity.sandbox.RejectedAccessException: unclassified field java.util.Collection toArray
at org.jenkinsci.plugins.scriptsecurity.sandbox.groovy.SandboxInterceptor.unclassifiedField(SandboxInterceptor.java:348)
at org.jenkinsci.plugins.scriptsecurity.sandbox.groovy.SandboxInterceptor.onGetProperty(SandboxInterceptor.java:344)
at org.kohsuke.groovy.sandbox.impl.Checker$4.call(Checker.java:241)
at org.kohsuke.groovy.sandbox.impl.Checker.checkedGetProperty(Checker.java:238)
at com.cloudbees.groovy.cps.sandbox.SandboxInvoker.getProperty(SandboxInvoker.java:23)
at com.cloudbees.groovy.cps.impl.PropertyAccessBlock.rawGet(PropertyAccessBlock.java:17)
at Script1.mostRecentReleaseTag(Script1.groovy:156)
at WorkflowScript.run(WorkflowScript:14)

To approve the use of the field, method, or class, you'll either need administrative access or need to ask someone who has it to perform the approval. From the main Jenkins page, go to the Manage Jenkins page, click on the In-process Script Approval link, and click on the Approve button next to the target field, method, or class.

Errors like this also show up when attempting to access a field or method that doesn’t exist for a class. For example, there is no length field in java.util.ArrayList (the length is given by the size method).
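A quick way to trip over this: java.util.ArrayList has no length field, so in the sandbox an attempt to read it is rejected just like an unapproved method would be.

```groovy
def items = ['a', 'b', 'c']
assert items.size() == 3   // correct: size() is a method on List
// items.length            // fails: ArrayList has no length field,
//                         // so the sandbox reports an "unclassified field" error
```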

String Interpolation

Groovy allows both single quotes and double quotes for denoting strings (or triple single quotes and triple double quotes for multi-line strings). The primary difference is that double-quoted strings interpolate any Groovy expressions embedded in the string, much like bash and zsh do.

def name = 'Bob'
def greeting = "Sally said \"Hi ${name}!\""
assert greeting.toString() == 'Sally said "Hi Bob!"'

Logging Output From Shell Scripts

It’s really easy to execute a shell script (or Python/Ruby/… script) from within a build pipeline script. Just use the sh command. For example:

sh 'echo "Hello Shell!"'

But getting the output from that command is not directly supported. Here’s a little hack to get the output:

import java.util.UUID

def shellCommandOutput(command) {
    // Redirect the command's output to a uniquely named temp file,
    // then read the file back and clean it up.
    def uuid = UUID.randomUUID()
    def filename = "cmd-${uuid}"
    echo filename
    sh "${command} > ${filename}"
    def result = readFile(filename).trim()
    sh "rm ${filename}"
    return result
}

With that function, you can now do things like use curl to query the GitHub API for the latest commit SHA on a branch.

def latest_sha = shellCommandOutput("""
    curl -H 'Authorization: token ${env.GITHUB_API_TOKEN}' \
         -L \
         'https://api.github.com/repos/jlinder/${repo}/branches/${branch}' \
         | jq -r '.commit.sha'
""")

Accessing Secrets

My build pipeline scripts call a bunch of APIs and for each one they need a secret of some type. There is a Jenkins plugin, the Credentials Binding Plugin, that gives pipeline scripts access to username/password pairs, secret text, certificates and other credentials stored in Jenkins.

Once the plugin is installed, the withCredentials() function can be used to temporarily put secrets into environment variables. An example of how to do this:

def withMyCredentials(body) {
    withCredentials([
        [$class: 'StringBinding', credentialsId: 'github-deploy-token',
         variable: 'GITHUB_API_TOKEN']
    ], body)
}

// Tag a release
withMyCredentials {
    def gitOutput = shellCommandOutput("""
        curl -H 'Authorization: token ${env.GITHUB_API_TOKEN}' --data \
             '{\"ref\": \"refs/tags/${tagName}\", \"sha\": \"${gitSha}\"}' \
             -L 'https://api.github.com/repos/jlinder/${repo}/git/refs'
    """)
}

Minimal Error Information

When using the sh command to run a script, if the script fails and doesn’t write any info to stdout/stderr about why it failed, the build output will report something like:

ERROR: script returned exit code 1
Finished: FAILURE

In our pipeline, there are often many things that happen between a failed script and the end of the output. The way I worked around not being told which script had failed was to make sure all the scripts in the pipeline print additional error information before exiting with a failure code. Then, a simple search for "Error" turns up the script (or scripts) that failed.
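One way to apply this from the pipeline side is to trap a failing step and print a searchable marker before re-raising the failure (the deploy.sh name is made up):

```groovy
// Run the script; if it fails, print a searchable "Error" marker
// and rethrow so the build still goes red.
try {
    sh './deploy.sh'
} catch (e) {
    echo "Error: deploy.sh failed: ${e.message}"
    throw e
}
```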

Commands Requiring a “node” Block

Some commands seem to require being run within a node block. For example, git and sh. This makes using them in reusable functions a little tricky because sometimes the calling context is already running from within a node block and other times it isn’t. To get around this, we created an ensureNode() function (super hacky).

def ensureNode(body) {
    // If sh throws, we're outside a node context and need to allocate one.
    def nodeNeeded = false
    try {
        sh 'pwd'
    } catch (e) {
        nodeNeeded = true
    }
    if (nodeNeeded) {
        node(body)
    } else {
        body()
    }
}

This generally works. However, when it runs outside a node context, the error thrown by the sh 'pwd' line ends up in the logs, adding many lines of noise, including error information. When there are multiple occurrences of this in a pipeline, it makes the source of the real problem harder to find.

In Closing

The journey to build a reusable Jenkins pipeline for our apps was full of glitches and hiccups (building a hack to get command-line output, ensuring node blocks are used at the right times, working around serialization restrictions, …). I also found many things I don't like about how the pipelines work, such as:

  • Errors reported on failed builds are often misleading or tell nothing about how to actually fix the problem
  • So many times it was easier to just shell out to accomplish a goal than to do it within Groovy, a pattern that appears to be endorsed by the community; it feels like the process is duct-taped together
  • Changing pipeline steps removes the view of all the previous runs of the pipeline

Jenkins pipelines also bring an entirely new development stack to our team. Few of our team members know Java and the only Groovy any of us knows is what we’ve learned during pipeline development. The documentation about the basics is ok. Beyond the basics, particularly for edge cases and errors, the documentation is poor. This makes the barrier to entry higher than desired.

All that said, now we have a reusable, continuous deployment pipeline for any dockerized application. Setting up a new project with it takes only a couple hours for someone unfamiliar with the process. The old way took more than a day for someone familiar with the process and days for those new to the process. This translates into being able to move faster and to deploy easily and often.

The pipeline mostly fits our current backend deployment needs and can grow further with us. Improvements on our radar include refactoring the code into Python, consolidating many scripts into fewer, and building a pipeline for frontend deploys. However, our vision of how we want our deploy system to work is different from what Jenkins pipelines allow us to do. I expect we'll move on from Jenkins pipelines in time, though it is not clear when.