GitHub Actions: Reusability, DRY Principle, Debugging and Fast Feedback
In this article, we will explore some methods of workflow debugging and create a reusable workflow.
The DRY principle stands for Don’t Repeat Yourself. It is a principle of software development that aims to reduce repetition and code duplication. The DRY principle also applies to workflows and is relatively easy to implement in GitHub Actions.
We will first recap the essentials of GitHub Actions and explore methods of workflow debugging. Then, we will create a composite action to lint Terraform code and compare it with reusable workflows. Finally, we will create a simple reusable workflow to run tests of Terraform modules in parallel.
Table of contents:
· Core concepts
· Developing and debugging
∘ act tool
∘ gh tool
∘ Enabling debug logging in GitHub
· Implementing a simple Composite Action
· Reusable workflows for parallel linting of Terraform code
· Reusable workflow for parallel testing of Terraform modules
· Running module tests in parallel with GitHub Actions matrix property
∘ Dynamically building a matrix
· Conclusions
Before we dive into the details of creating and using Actions, let’s first revisit the main concepts of GitHub Actions and how they work together.
Core concepts
Workflows are automated processes that you can configure to run one or more jobs when something happens in your repository, like a pull request, an issue, or a push.
Jobs are sets of steps that perform a specific function. Jobs can run one after another or at the same time, depending on how you configure your workflow.
Steps are individual commands or actions that run as part of a job. Steps can be shell scripts or reusable extensions called actions. Data can be shared between steps.
Actions are reusable extensions that can simplify your workflow by encapsulating common tasks or logic. You can use actions from the GitHub Marketplace, from public repositories, or from your own repositories.
Contexts are objects that have information about the workflow run, the machine environment, the repository, the secrets, and the inputs. You can use contexts to get data and pass it between different steps in your workflows.
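As a rough illustration of how these pieces fit together, here is a minimal sketch of a workflow. The workflow name, the job name, and the environment variable names are made up for this example; only the github context properties and the checkout action are standard.

```yaml
# Hypothetical file: .github/workflows/hello.yaml
name: Hello
on: [push, pull_request]            # events that trigger the workflow
jobs:
  greet:                            # a job (hypothetical name)
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3   # a step that uses an action
      - name: Print context data    # a step that runs a shell command
        env:
          # Values read from the github context, passed in as environment variables
          EVENT_NAME: ${{ github.event_name }}
          GIT_REF: ${{ github.ref }}
        run: echo "Triggered by $EVENT_NAME on $GIT_REF"
```

Passing context values through env rather than interpolating them directly into the script keeps untrusted values, such as branch names, out of the shell command line.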
There are 3 types of Actions:
- Container actions run inside a Docker container.
- JavaScript actions run directly on the runner machine and can use Node.js and the GitHub Actions toolkit for the development.
- Composite actions combine multiple workflow steps within one action. Composite actions can use shell scripts or other actions as steps. Composite actions are useful for combining together common tasks or logic that can be reused in different workflows.
To create an action, you need to provide a metadata file, saved as action.yaml, that defines the parameters and the main logic of your action. The parameters include the inputs that your action accepts and the outputs that it produces. The main logic is the entrypoint for your action.
“The metadata filename must be either action.yml or action.yaml.”
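To make the idea of inputs and outputs more concrete, here is a small sketch of a metadata file. The action name, the who-to-greet input, the greeted-at output, and the step id are all invented for illustration; the runs section it relies on is described right below.

```yaml
# Hypothetical action.yaml for a tiny composite action
name: 'Greeter'
description: 'Greets someone and reports when it happened'
inputs:
  who-to-greet:                     # hypothetical input
    description: 'Who to greet'
    required: false
    default: 'World'
outputs:
  greeted-at:                       # hypothetical output
    description: 'UTC timestamp of the greeting'
    value: ${{ steps.greet.outputs.time }}
runs:
  using: 'composite'
  steps:
    - id: greet
      shell: bash
      env:
        WHO: ${{ inputs.who-to-greet }}
      run: |
        echo "Hello $WHO"
        # Expose a step output; the action-level output above maps to it
        echo "time=$(date -u +%FT%TZ)" >> "$GITHUB_OUTPUT"
```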
The runs property defines how an action is executed. The parameters of runs depend on the type of action, which is specified by the runs.using property. There are three possible types of actions: docker, nodeXX (for example node16), and composite.
JavaScript actions
For runs.using equal to node16 we have, e.g.:
- main: The name of the JavaScript file that contains the action code.
runs:
  using: 'node16'
  main: 'index.js'
Other parameters supported by runs for JavaScript actions are runs.pre, runs.pre-if, runs.post, and runs.post-if. They work as additional hooks executed before and after the main action.
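A sketch of how these hooks could be declared is shown below. The script names setup.js and cleanup.js and the conditions are assumptions made for the example, not part of any real action.

```yaml
# Fragment of a hypothetical action.yaml for a JavaScript action
runs:
  using: 'node16'
  pre: 'setup.js'                   # runs before main (hypothetical file)
  pre-if: runner.os == 'Linux'      # run the pre hook only on Linux runners
  main: 'index.js'
  post: 'cleanup.js'                # runs at the end of the job (hypothetical file)
  post-if: always()                 # run the post hook even if earlier steps failed
```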
Composite action
For runs.using equal to composite we have only the steps parameter:
- steps: An array of steps to run as part of the action.
runs:
  using: 'composite'
  steps:
    - run: echo Hello ${{ inputs.who-to-greet }}.
      shell: bash
Docker action
For runs.using equal to docker we have, e.g.:
- image: The Docker image to use as the container that runs the action. It can be a Dockerfile in the repository or an image from a registry.
- entrypoint: Overrides the ENTRYPOINT defined in the Dockerfile, or sets it if one was not already specified.
For example:
runs:
  using: 'docker'
  image: 'Dockerfile'
  pre-entrypoint: 'pre-entrypoint.sh'
  entrypoint: 'entrypoint.sh'
Other parameters supported by runs for Docker actions are runs.pre-entrypoint, runs.env, runs.post-entrypoint, and runs.args.
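As a sketch, the remaining Docker parameters could be combined like this. The LOG_LEVEL variable, the cleanup.sh script, the --verbose argument, and the who-to-greet input are placeholders chosen for the example.

```yaml
# Fragment of a hypothetical action.yaml for a Docker action
runs:
  using: 'docker'
  image: 'Dockerfile'
  env:
    LOG_LEVEL: 'info'               # environment variable set inside the container
  args:                             # strings passed to the container's ENTRYPOINT
    - ${{ inputs.who-to-greet }}
    - '--verbose'
  post-entrypoint: 'cleanup.sh'     # cleanup script run after the main entrypoint
```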
In the next sections, we will dive deeper into the process of creating the composite action.
After this brief introduction, let’s see how we can make developing and debugging our GitHub workflows easier.
Developing and debugging
You might have experienced the frustration of waiting for your workflow, only to find out that it failed due to a simple mistake. It can take several iterations of changing your code, committing to GitHub, and running the pipeline until the bug is fixed. This is where act comes in: it is a convenient tool that lets you run your Actions locally. It provides faster feedback and fine-grained control over your workflow execution, saving you time.
act tool
Common parameters:
- --quiet: enables quiet output, which suppresses most of the logs and only shows the summary of the workflow run
- --verbose: enables verbose output, which shows more details about the actions and steps being executed
- -l: lists the workflows together with their job IDs and events
- -W: specifies the path to the directory containing the workflow files; it can also point to a specific file
- -j: specifies a job to run
- -s: specifies a secret to make available to the workflow
- -w / --watch: watches for changes
Examples:
# Load secrets from a file:
act --secret-file my.secrets
# This sets the secret MY_SECRET to the value foo.
act -s MY_SECRET=foo
For example, we can set GITHUB_TOKEN to authenticate requests to the GitHub API. The GITHUB_TOKEN is a special type of secret that is automatically created by GitHub. It is scoped to the current repository and has a limited set of permissions.
When you use act, the GITHUB_TOKEN is not automatically generated or provided by GitHub, so you need to set it manually using the -s flag. You can create a personal access token (PAT) from your GitHub settings and use it as the GITHUB_TOKEN value.
Here is an example of how to use act with GITHUB_TOKEN:
$ act -s GITHUB_TOKEN=123456789abcdef
In order to run the main workflow, we can use:
act -W ./.github/workflows/main.yml
gh tool
Once we have tested our workflow locally (e.g., with act) and it seems to be working, we can commit it to our GitHub repository to see how it behaves in a real environment.
The gh tool simplifies this step: it can perform many GitHub operations from a terminal, without needing to switch to the web browser, for example:
gh workflow list: lists the workflows.
$ gh workflow list
WORKFLOW ID STATE CREATED
CI 123 active 1 month ago
Deploy 456 active 2 weeks ago
gh workflow view <workflow-name>: shows the details of a workflow (its ID and the list of its runs).
For example, to see the YAML file of the workflow:
gh workflow view main.yml --yaml
gh run watch: shows runs in progress
gh run view: shows the details of a workflow run. It supports the following flags:
- --log: displays the log for a run
- --log-failed: displays the log only for the steps that failed
- -v or --verbose: verbose output
- -j: displays output for a specific job ID
- -w or --web: opens the run in the web browser. This will launch your default web browser and open the workflow run page on GitHub.com.
Enabling debug logging in GitHub
Sometimes, the default workflow logs may not give you enough information to troubleshoot a problem with your workflow, job, or step. For example, you may want to see the values of some variables, the output of some commands, or the details of some API calls. In such cases, you can enable debug logging to get more verbose and detailed logs.
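There are two variables, ACTIONS_RUNNER_DEBUG and ACTIONS_STEP_DEBUG, which make logging more verbose. ACTIONS_RUNNER_DEBUG enables additional runner diagnostic logs, and ACTIONS_STEP_DEBUG enables more detailed step logs.
We can set them to true as secrets or variables of the repository that contains the workflow (under Settings, in the Actions section of Secrets and variables).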
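Once step debug logging is enabled, a workflow can also emit its own debug messages. The sketch below uses the ::debug:: workflow command and the runner.debug context property; the job name and step names are invented for the example.

```yaml
jobs:
  debug-demo:                       # hypothetical job name
    runs-on: ubuntu-latest
    steps:
      - name: Message shown only when step debug logging is enabled
        run: echo "::debug::Working directory is $PWD"
      - name: Extra diagnostics only in debug runs
        # runner.debug is set to 1 only when debug logging is enabled
        if: runner.debug == '1'
        run: env | sort
```

This keeps regular runs quiet while still letting a debug re-run print additional diagnostics.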
Now that we have explored how to debug and simplify the development of GitHub actions, let’s proceed and create an example of a Composite Action.
Implementing a simple Composite Action
Let’s suppose we have different projects that use Terraform modules. We want to create reusable actions that can test and lint our Terraform code, so we don’t have to write the same pipeline for each project. We can store these actions in a separate repository and use them in the workflows of our Terraform projects. This way, we can save time across our projects.
In order to have a stable environment, it is crucial to pin versions, using tags both for the action itself and for the tools used within the steps:
“We strongly recommend that you include the version of the action you are using by specifying a Git ref, SHA, or Docker tag number. If you don’t specify a version, it could break your workflows or cause unexpected behavior when the action owner publishes an update.”
Let’s now create a directory structure for two actions:
.
├── .github
│   ├── workflows
│   │   └── main.yaml
│   └── actions
│       ├── test
│       │   └── action.yaml
│       └── lint
│           └── action.yaml
...
Now, let’s look at a possible composite action for running Terraform static analysis (lint/action.yaml):
name: 'Static Analysis'
description: 'Run static analysis'
inputs:
  checkov_version:
    description: 'Checkov version'
    default: "2.3.245"
  tfsec_version:
    description: 'tfsec version'
    default: "1.28.1"
runs:
  using: "composite"
  steps:
    - name: Test with Checkov
      uses: bridgecrewio/checkov-action@v12.1346.0
      with:
        framework: terraform
        soft_fail: true
        output_format: sarif
        output_file_path: sarif_dir
        version: ${{ inputs.checkov_version }}
    - name: tfsec
      uses: aquasecurity/tfsec-sarif-action@v0.1.4
      with:
        tfsec_version: ${{ inputs.tfsec_version }}
        soft_fail: true
        sarif_file: tfsec.sarif
- inputs are the parameters that you can pass to a composite action when you use it in a workflow.
- uses marks the steps that use other Actions.
Also, if we want to upload the results of a static analysis tool to GitHub, we can add another step to automate this process. One of the actions that you can use is the codeql-action/upload-sarif action. SARIF stands for Static Analysis Results Interchange Format. It is a standard format for the output of static analysis tools, which makes it easier to share results across different tools and platforms. The upload-sarif action takes a SARIF file as input and uploads it to GitHub:
- name: Upload terrascan SARIF file
  uses: github/codeql-action/upload-sarif@v2
  with:
    sarif_file: <directory_path>/<file_path>
Next, let’s reference the lint composite action in the main.yaml workflow:
jobs:
  static-analysis:
    runs-on: ubuntu-latest
    name: Static Analysis
    steps:
      - uses: actions/checkout@v3
      - uses: ./.github/actions/lint
Here, uses: ./.github/actions/lint points to a local action that we have defined in our repository. We can reference an action from a remote repository as well.
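For the remote case, a sketch could look like the following. The my-org/shared-actions repository and the v1 tag are placeholders; the general form is owner/repo/path@ref.

```yaml
jobs:
  static-analysis:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      # Hypothetical remote repository that hosts the same lint action
      - uses: my-org/shared-actions/.github/actions/lint@v1
        with:
          tfsec_version: "1.28.1"   # overrides the default declared in action.yaml
```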
There is one problem: when we run our workflow, we can immediately see that the steps of the lint action execute one by one, and none of the composite action properties can change that. In other words, we cannot create multiple jobs in a composite action, so we cannot run our linters in parallel. We need to use a different GitHub Actions feature.
Limitation of a composite action: steps cannot be run in parallel
We can create reusable workflows, which can have multiple jobs, by using a different feature: on: workflow_call. This feature enables you to invoke a workflow from another workflow.
Reusable workflows for parallel linting of Terraform code
Now, let’s convert our lint composite action into a reusable workflow (lint.yaml):
on:
  workflow_call:
    inputs:
      checkov_version:
        description: 'Checkov version'
        type: string
        default: "2.3.245"
      tfsec_version:
        description: 'tfsec version'
        type: string
        default: "1.28.1"
jobs:
  checkov:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v3
      - name: Test with Checkov
        id: checkov
        uses: bridgecrewio/checkov-action@v12.1346.0
        with:
          framework: terraform
          output_format: sarif
          output_file_path: sarif_dir
          version: ${{ inputs.checkov_version }}
      - name: Upload terrascan SARIF file
        uses: github/codeql-action/upload-sarif@v2
        if: success() || failure()
        with:
          sarif_file: sarif_dir/results_sarif.sarif
  tfsec:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v3
      - name: tfsec
        uses: aquasecurity/tfsec-sarif-action@v0.1.4
        with:
          tfsec_version: ${{ inputs.tfsec_version }}
          sarif_file: tfsec.sarif
      - name: Upload terrascan SARIF file
        uses: github/codeql-action/upload-sarif@v2
        if: success() || failure()
        with:
          sarif_file: tfsec.sarif
The workflow has two jobs: one for Checkov and one for tfsec. The jobs run in parallel, and each performs the following steps:
- Checkout the repository
- Run the scan tool and generate a SARIF file with the results
- Upload the SARIF file to GitHub Code Scanning
To call a reusable workflow from the main.yaml workflow in the same repository, you need to use the uses keyword, pointing to the right file:
jobs:
  lint:
    uses: ./.github/workflows/lint.yaml
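The caller can also override the inputs declared under workflow_call. The sketch below assumes the lint.yaml inputs shown earlier; secrets: inherit is optional and only matters if the called workflow needs the caller's secrets.

```yaml
jobs:
  lint:
    uses: ./.github/workflows/lint.yaml
    with:
      checkov_version: "2.3.245"    # input declared in lint.yaml
      tfsec_version: "1.28.1"       # input declared in lint.yaml
    secrets: inherit                # pass the caller's secrets to the called workflow
```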
To reference a reusable workflow from another project, use:
jobs:
  lint:
    uses: {owner}/{repo}/.github/workflows/lint.yaml@v1
We can observe that both jobs are running simultaneously:
Reusable workflow for parallel testing of Terraform modules
Once we have implemented parallel linting with different tools, we will move on to the next step, which is creating a simple example of a workflow that can run Terraform module tests in parallel.
To use parallel testing with GitHub Actions, we need to use a special syntax called matrix strategy. This syntax allows us to define multiple values for one or more variables, and then run a job for each combination of those values.
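As a minimal sketch of a static matrix, the job below would expand into four jobs, one per combination of module and Python version. The module names, the Python versions, and the directory layout are placeholders for this example.

```yaml
jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false              # let the other combinations finish if one fails
      matrix:
        module: [alb, s3]                   # placeholder module names
        python-version: ["3.10", "3.11"]    # placeholder versions
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: ${{ matrix.python-version }}
      - name: Run tests for one module
        working-directory: modules/${{ matrix.module }}/tests
        run: |
          pip install -r requirements.txt
          pytest
```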
One way to test modules is tftest, a Python package that provides a testing framework for Terraform modules.
With tftest, users can write different types of tests, such as tests that only use Terraform init and plan to ensure that the code is syntactically correct and that the right number and type of resources would be created, or full-fledged tests that run the full apply / output / destroy cycle and can then be used to test the actual created resources or the state file. Users can leverage pytest fixtures to set up and tear down test environments or run multiple tests with different parameters.
Another popular test framework is Terratest, written in Go.
In this example, we have Terraform files that define an ALB module and an S3 module. The directories modules/alb/tests and modules/s3/tests contain our test cases. The file .github/workflows/main.yaml will just reference the test.yaml and lint.yaml pipelines defined in another repository containing reusable workflows.
├── .github
│   └── workflows
│       └── main.yaml
├── modules
│   ├── alb
│   │   ├── main.tf
│   │   ├── outputs.tf
│   │   ├── tests
│   │   │   ├── test_alb.py
│   │   │   └── requirements.txt
│   │   ├── variables.tf
│   │   └── versions.tf
│   └── s3
│       ├── main.tf
│       ├── outputs.tf
│       ├── tests
│       │   ├── test_s3.py
│       │   └── requirements.txt
│       ├── variables.tf
│       └── versions.tf
└── README.md
Here is a simple example of main.yaml.
jobs:
  lint:
    uses: <org>/<repo>/.github/workflows/lint.yaml@v1
  run-tests:
    uses: <org>/<repo>/.github/workflows/test.yaml@v1
We just add a new job and link it to the workflow from another repository with the uses keyword. Executing the tests is therefore as simple as referencing test.yaml in a new job.
Workflow repository:
├── .github
│   └── workflows
│       ├── lint.yaml
│       └── test.yaml
└── README.md
Let’s now define a basic example of test.yaml.
Running module tests in parallel with GitHub Actions matrix property
We will take advantage of the matrix feature that we mentioned previously.
Dynamically building a matrix
We can use jobs.<job_id>.strategy.matrix to define a matrix of different job configurations. The matrix field accepts a literal value or an expression. An expression can provide either a single value (such as a string) or a complex object (using fromJson).
However, if you want to use the output of a script as the input to matrix, you need a workaround. In our case, a script will just return a list of directories containing module tests in JSON format. One possible solution is to generate the JSON in one job (find-test-dirs) and pass it to the second job (test) using outputs:
jobs:
  find-test-dirs:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v3
      - name: Find test directories
        id: set-dirs
        run: |
          # The parentheses group both -name tests, so that -type d and
          # -not -path apply to "test" as well as to "tests"
          dirs=$(find ./ \( -name "test" -or -name "tests" \) \
            -type d \
            -not -path '*/.terraform/*' \
            | jq --raw-input --slurp 'split("\n")[:-1]')
          echo "dirs=$(echo $dirs)" >> $GITHUB_OUTPUT
    outputs:
      dirs: ${{ steps.set-dirs.outputs.dirs }}
The run step collects the paths of all directories named test or tests, except those located under .terraform, and represents them as a JSON array.
So the dirs variable in our example will contain the following JSON content:
[
  "./modules/s3/tests",
  "./modules/alb/tests"
]
- echo "dirs=$(echo $dirs)" >> $GITHUB_OUTPUT is a way to set an output variable for a GitHub Actions step.
- outputs is a field that defines the output variables for a job. We can use these variables in other jobs that depend on this job (set via needs).
Next, we add the dependency with needs: find-test-dirs in the test job:
  test:
    runs-on: ubuntu-latest
    needs: find-test-dirs
    strategy:
      matrix:
        dirs: ${{fromJson(needs.find-test-dirs.outputs.dirs)}}
    steps:
      - uses: actions/checkout@v3
      - name: Set up Python 3.11
        uses: actions/setup-python@v4
        with:
          python-version: "3.11"
      - name: Install dependencies
        working-directory: ${{ matrix.dirs }}
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt
      - name: Test with pytest
        working-directory: ${{ matrix.dirs }}
        run: |
          pytest
- needs: find-test-dirs is a field that tells GitHub Actions that this job depends on another job called find-test-dirs. This means that this job will only run after the find-test-dirs job has completed successfully.
- matrix: is a field that defines the matrix of different configurations for this job. A matrix is a set of variables and their values that can be used in other parts of the job definition.
- The fromJson in matrix is a function that converts the value from a JSON string to an array of strings. This means that the dirs variable will contain an array of directory paths that were found by the find-test-dirs job.
- working-directory: ${{ matrix.dirs }} is a field that sets the working directory for the step to the value of the dirs variable from the matrix. This means that each instance of this job will run the pytest command in a different directory from the array.
So, the test job will run multiple instances of pytest in parallel, each one testing a different directory that was found by the previous job.
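A dynamic matrix can be tuned with the same strategy options as a static one. The fragment below is a sketch of possible additions to the test job, not part of the original workflow: the guard assumes that find-test-dirs outputs exactly [] when no test directories exist, and the max-parallel value is arbitrary.

```yaml
  test:
    runs-on: ubuntu-latest
    needs: find-test-dirs
    # Skip the job when no test directories were found (assumes the output is exactly "[]")
    if: needs.find-test-dirs.outputs.dirs != '[]'
    strategy:
      fail-fast: false              # keep testing other modules if one fails
      max-parallel: 4               # arbitrary cap on simultaneous jobs
      matrix:
        dirs: ${{ fromJson(needs.find-test-dirs.outputs.dirs) }}
    steps:
      - uses: actions/checkout@v3
      - name: Show the directory under test
        run: echo "Testing ${{ matrix.dirs }}"
```

Without the guard, an empty array would leave the matrix with no combinations, which GitHub Actions reports as an error.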
In our example, we have two test directories (s3 and alb), so the workflow runs, as expected, two parallel test jobs.
Conclusions
GitHub Actions is a convenient tool that is easy to set up for GitHub repositories. What is more, it does not require any hosting, which makes it a good fit for open source projects.
By using Composite Actions and Reusable Workflows, we can follow the DRY principle and make our GitHub Actions more maintainable and reuse our existing pipeline code across different repositories.
Thanks