GitHub Actions: Reusability, DRY Principle, Debugging and Fast Feedback

Piotr Kleban
13 min read · Jun 25, 2023

In this article, we will explore some methods of workflow debugging and create a reusable workflow.

The DRY principle stands for Don’t Repeat Yourself. It is a software development principle that aims to reduce repetition and code duplication. The DRY principle also applies to workflows, and it is relatively easy to implement in GitHub Actions.

We will first recap the essentials of GitHub Actions and explore methods of workflow debugging. Then, we will create a composite action to lint Terraform code and compare it with reusable workflows. Finally, we will create a simple reusable workflow that runs tests of Terraform modules in parallel.

Table of contents:

· Core concepts
· Developing and debugging
  · act tool
  · gh tool
  · Enabling debug logging in GitHub
· Implementing a simple Composite Action
· Reusable workflows for parallel linting of Terraform code
· Reusable workflow for parallel testing of Terraform modules
· Running module tests in parallel with GitHub Actions matrix property
  · Dynamically building a matrix
· Conclusions

Before we dive into the details of creating and using Actions, let’s first revisit the main concepts of GitHub Actions and how they work together.

Core concepts

Workflows are automated processes that you can configure to run one or more jobs when something happens in your repository, like a pull request, an issue, or a push.

Jobs are sets of steps that perform a specific function. Jobs can run one after another or at the same time, depending on how you configure your workflow.

Steps are individual commands or actions that run as part of a job. Steps can be shell scripts or reusable extensions called actions. Data can be shared between steps.

Actions are reusable extensions that can simplify your workflow by encapsulating common tasks or logic. You can use actions from the GitHub Marketplace, from public repositories, or from your own repositories.

Contexts are objects that have information about the workflow run, the machine environment, the repository, the secrets, and the inputs. You can use contexts to get data and pass it between different steps in your workflows.
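To make these concepts concrete, here is a minimal sketch of a workflow. The file name, the job names and the greeting value are illustrative assumptions and are not part of the project used later in this article. It shows a trigger, two jobs ordered with needs, a step that shares data through $GITHUB_OUTPUT, and the github, steps and needs contexts.

```yaml
# .github/workflows/concepts.yaml (hypothetical file name)
name: Concepts demo

# Workflow trigger: run on every push and pull request
on: [push, pull_request]

jobs:
  build:
    runs-on: ubuntu-latest
    # Job outputs make step data available to other jobs
    outputs:
      greeting: ${{ steps.greet.outputs.greeting }}
    steps:
      # A step that uses an action
      - uses: actions/checkout@v3
      # A step that runs a shell command and sets a step output
      - name: Produce a value
        id: greet
        run: echo "greeting=hello" >> "$GITHUB_OUTPUT"
      # A later step in the same job reads it through the steps context
      - name: Read the value in the same job
        run: echo "${{ steps.greet.outputs.greeting }} from ${{ github.repository }}"

  report:
    # Without needs, jobs run in parallel; needs makes this job wait for build
    needs: build
    runs-on: ubuntu-latest
    steps:
      - name: Read the value from another job
        run: echo "build said ${{ needs.build.outputs.greeting }} (event ${{ github.event_name }})"
```

Removing the needs line from the report job would let both jobs start at the same time, but the needs context would then no longer be available to it.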

There are 3 types of Actions:

  • Container actions run inside a Docker container.
  • JavaScript actions run directly on the runner machine and can be developed with Node.js and the GitHub Actions toolkit.
  • Composite actions combine multiple workflow steps within one action. Their steps can be shell scripts or other actions, which makes them useful for bundling common tasks or logic that is reused across workflows.

To create an action, you need to provide a metadata file, saved as action.yaml, that defines the parameters and the main logic of your action. The parameters include the inputs that your action accepts and the outputs that it produces. The main logic is the entrypoint of your action.

“The metadata filename must be either action.yml or action.yaml”

The runs property defines how an action is executed. The parameters of runs depend on the type of action, which is specified by the runs.using property. There are three possible types of actions: docker, nodeXX (for example node16) and composite.

JavaScript actions

For runs.using equal to node16 we have, for example:

  • main: The name of the JavaScript file that contains the action code.

runs:
  using: 'node16'
  main: 'index.js'

Other parameters supported by runs for JavaScript actions are runs.pre, runs.pre-if, runs.post and runs.post-if. They work as additional hooks executed before and after the main action.
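As a rough sketch of how these hooks fit together, the metadata file of a JavaScript action could look like the one below. The script names setup.js and cleanup.js and the chosen conditions are assumptions made only for illustration.

```yaml
# action.yaml of a hypothetical JavaScript action
name: 'Hook demo'
description: 'Shows the pre and post hooks of a JavaScript action'
runs:
  using: 'node16'
  pre: 'setup.js'                 # script executed before main (hypothetical file)
  pre-if: runner.os == 'Linux'    # run the pre script only on Linux runners
  main: 'index.js'
  post: 'cleanup.js'              # script executed at the end of the job (hypothetical file)
  post-if: always()               # run the cleanup even if earlier steps failed
```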

Composite action

For runs.using equal to composite we have only the steps parameter:

  • steps: An array of steps to run as part of the action.

runs:
  using: 'composite'
  steps:
    - run: echo Hello ${{ inputs.who-to-greet }}.
      shell: bash
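The fragment above reads inputs.who-to-greet, so the action also has to declare that input. A fuller metadata file might look like the sketch below. The action name, the default value and the greeting output are assumptions added for illustration.

```yaml
# action.yaml of a hypothetical "hello" composite action
name: 'Hello'
description: 'Greet someone and expose the greeting as an output'
inputs:
  who-to-greet:
    description: 'Who to greet'
    required: true
    default: 'World'
outputs:
  greeting:
    description: 'The full greeting text'
    # Composite actions map each output explicitly to a step output
    value: ${{ steps.greet.outputs.greeting }}
runs:
  using: 'composite'
  steps:
    - id: greet
      # shell is mandatory for run steps inside a composite action
      shell: bash
      env:
        WHO: ${{ inputs.who-to-greet }}
      run: |
        echo "Hello $WHO."
        echo "greeting=Hello $WHO." >> "$GITHUB_OUTPUT"
```

Passing the input through an environment variable instead of expanding it inside the script is a defensive choice, since it keeps the input value from being interpreted as shell code.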

Docker action

For runs.using equal to docker we have, for example:

  • image: The Docker image used as the container that runs the action. The value can be the name of a Dockerfile in the action repository or an image from a public registry.
  • entrypoint: Overrides the ENTRYPOINT defined in the Dockerfile, or sets it if none is specified.

For example:

runs:
  using: 'docker'
  image: 'Dockerfile'
  pre-entrypoint: 'pre-entrypoint.sh'
  entrypoint: 'entrypoint.sh'

Other parameters supported by runs for Docker actions are runs.pre-entrypoint, runs.env, runs.post-entrypoint and runs.args.
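To illustrate the remaining parameters, here is a small sketch of a Docker action that passes an input to the container. The input name greeting, the LOG_LEVEL variable and the script cleanup.sh are hypothetical.

```yaml
# action.yaml of a hypothetical Docker action
name: 'Docker demo'
description: 'Passes an input to the container as an argument'
inputs:
  greeting:
    description: 'Text passed to the entrypoint'
    default: 'Hello'
runs:
  using: 'docker'
  image: 'Dockerfile'
  env:
    LOG_LEVEL: 'debug'             # hypothetical variable read by the scripts
  args:
    - ${{ inputs.greeting }}       # passed to the entrypoint as its first argument
  entrypoint: 'entrypoint.sh'
  post-entrypoint: 'cleanup.sh'    # hypothetical cleanup script, runs after the entrypoint
```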

In the next sections, we will dive deeper into the process of creating the composite action.

After this brief introduction, let’s see how we can make developing and debugging our GitHub workflows easier.

Developing and debugging

You might have experienced the frustration of waiting for your workflow to finish, only to find out that it failed due to a simple mistake. It can take several iterations of changing your code, committing to GitHub and running the pipeline until the bug is fixed. That’s why act is a convenient tool: it lets you run your Actions locally. It gives faster feedback and more control over workflow execution, which saves time.

act tool

Common parameters:

  • --quiet: enables quiet output, which disables logging of the output from steps
  • --verbose: enables verbose output, which shows more details about the actions and steps being executed
  • -l: lists the workflows with their job IDs and events
  • -W: specifies the path to the directory containing the workflow files; it can also point to a specific file
  • -j: specifies a job to run
  • -s: specifies a secret to make available to the workflow
  • -w / --watch: watches the files for changes and reruns the workflow when they change

Examples :

# Load secrets from a file:
act --secret-file my.secrets
# This sets the secret MY_SECRET to the value foo.
act -s MY_SECRET=foo

For example, we can set GITHUB_TOKEN to authenticate requests to the GitHub API. The GITHUB_TOKEN is a special type of secret that is automatically created by GitHub. It is scoped to the current repository and has a limited set of permissions.

When you use act, the GITHUB_TOKEN is not automatically generated or provided by GitHub, so you need to set it manually using the -s flag. You can create a personal access token (PAT) from your GitHub settings and use it as the GITHUB_TOKEN value.

Here is an example of how to use act with GITHUB_TOKEN:

$ act -s GITHUB_TOKEN=123456789abcdef

To run the main workflow, we can use:

act -W ./.github/workflows/main.yml

gh tool

Once we have tested our workflow locally (e.g., with act) and it seems to be working, we can commit it to our GitHub repository to see how it behaves in a real environment.

A tool that simplifies this is gh. It can perform many GitHub operations from a terminal, without switching to the web browser, for example:

  • gh workflow list : to list the workflows.
$ gh workflow list
WORKFLOW ID STATE CREATED
CI 123 active 1 month ago
Deploy 456 active 2 weeks ago
  • gh workflow view <workflow-name> : to see the details of a workflow (its ID and list of runs).

For example, to see the YAML file of the workflow:

gh workflow view main.yml --yaml

  • gh run watch : watches a run in progress until it completes
  • gh run view : shows the details of a workflow run. Useful flags:

    --log: display the log for a run
    --log-failed: display the log only for the steps that failed
    -v or --verbose: verbose output that also shows the job steps
    -j or --job: display the output of a specific job ID
    -w or --web: open the run in the web browser, on the workflow run page on GitHub.com

Enabling debug logging in GitHub

Sometimes, the default workflow logs may not give you enough information to troubleshoot a problem with your workflow, job, or step. For example, you may want to see the values of some variables, the output of some commands, or the details of some API calls. In such cases, you can enable debug logging to get more verbose and detailed logs.

There are two variables that make logging more verbose: ACTIONS_RUNNER_DEBUG, which enables runner diagnostic logging, and ACTIONS_STEP_DEBUG, which enables step debug logging.

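We can set them to true as secrets or variables of the repository that contains the workflow (in the repository settings, under Secrets and variables, Actions).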
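With ACTIONS_STEP_DEBUG set to true, a workflow can also emit its own debug lines. The sketch below assumes a hypothetical debug-demo.yaml workflow. It prints a message with the ::debug:: workflow command and runs an extra diagnostic step only when debug logging is enabled.

```yaml
# .github/workflows/debug-demo.yaml (hypothetical file name)
name: Debug demo
on: workflow_dispatch

jobs:
  debug-demo:
    runs-on: ubuntu-latest
    steps:
      - name: Emit a debug message
        env:
          REF: ${{ github.ref }}
        # Lines starting with ::debug:: appear in the log only when
        # step debug logging is enabled
        run: echo "::debug::workflow was triggered for $REF"
      - name: Extra diagnostics for debug runs only
        # runner.debug is set to 1 only when debug logging is enabled
        if: runner.debug == '1'
        run: env | sort
```

In a normal run the first step prints nothing visible and the second step is skipped, so the extra output costs nothing until it is needed.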

Now that we have explored how to debug and simplify the development of GitHub actions, let’s proceed and create an example of a Composite Action.

Implementing a simple Composite Action

Let’s suppose we have different projects that use Terraform modules. We want to create reusable actions that can test and lint our Terraform code, so we don’t have to write the same pipeline for each project. We can store these actions in a separate repository and use them in our workflows that use terraform modules. This way, we can save time across our projects.

To keep the environment stable, it is crucial to pin versions with tags, both for the action itself and for the tools used within its steps:

“We strongly recommend that you include the version of the action you are using by specifying a Git ref, SHA, or Docker tag number. If you don’t specify a version, it could break your workflows or cause unexpected behavior when the action owner publishes an update.”
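As a short sketch of what this pinning can look like in practice, the steps below pin one action to a tag and another to a commit SHA. The SHA is deliberately left as a placeholder, and the tfsec version is the one used in this article.

```yaml
steps:
  # Pinned to a tag: readable, but a tag can be moved by the action owner
  - uses: actions/checkout@v3
  # Pinned to a full commit SHA: immutable.
  # <full-commit-sha> is a placeholder, replace it with a real commit SHA.
  - uses: aquasecurity/tfsec-sarif-action@<full-commit-sha>
    with:
      # The tool version is pinned separately through the action input
      tfsec_version: "1.28.1"
```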

Let’s now create a directory structure for two actions:

.
├── .github
│   ├── workflows
│   │   └── main.yaml
│   └── actions
│       ├── test
│       │   └── action.yaml
│       └── lint
│           └── action.yaml

...

Now, let’s look at a possible composite action for running Terraform static analysis (lint/action.yaml):

name: 'Static Analysis'
description: 'Run static analysis'
inputs:
  checkov_version:
    description: 'Checkov version'
    default: "2.3.245"
  tfsec_version:
    description: 'tfsec version'
    default: "1.28.1"

runs:
  using: "composite"

  steps:
    - name: Test with Checkov
      uses: bridgecrewio/checkov-action@v12.1346.0
      with:
        framework: terraform
        soft_fail: true
        output_format: sarif
        output_file_path: sarif_dir
        version: ${{ inputs.checkov_version }}

    - name: tfsec
      uses: aquasecurity/tfsec-sarif-action@v0.1.4
      with:
        tfsec_version: ${{ inputs.tfsec_version }}
        soft_fail: true
        sarif_file: tfsec.sarif

  • inputs are the parameters that you can pass to a composite action when you use it in a workflow.
  • uses are the steps that use other Actions.

Also, if we want to upload the results of a static analysis tool to GitHub, we can add another job step to automate this process. One of the actions that you can use is the codeql-action/upload-sarif action. This action takes a SARIF file as an input and uploads it to GitHub:

SARIF stands for Static Analysis Results Interchange Format. SARIF is a standard format for the output of static analysis tools, which makes it easier to share results across different tools and platforms.

- name: Upload SARIF file
  uses: github/codeql-action/upload-sarif@v2
  with:
    sarif_file: <directory_path>/<file_path>

Next, let’s reference the lint composite action (lint/action.yaml) in the main.yaml workflow:

jobs:
  static-analysis:
    runs-on: ubuntu-latest
    name: Static Analysis
    steps:
      - uses: actions/checkout@v3
      - uses: ./.github/actions/lint

  • uses: ./.github/actions/lint: in this example it is a local action that we have defined in our repository. We can reference an action from a remote repository as well.
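The inputs declared by the action can be overridden with the with keyword. The following sketch shows this for the local action, and a commented-out remote reference where the owner, the repository and the v1 tag are placeholders.

```yaml
jobs:
  static-analysis:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      # Local action, with explicit tool versions instead of the defaults
      - uses: ./.github/actions/lint
        with:
          checkov_version: "2.3.245"
          tfsec_version: "1.28.1"
      # The same action referenced from another repository
      # (<owner>/<repo> and the v1 tag are placeholders):
      # - uses: <owner>/<repo>/.github/actions/lint@v1
```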

There is one problem: when we run our workflow, we can immediately see that the steps in lint/action.yaml execute one by one, and none of the composite action properties can change that. In other words, we cannot create multiple jobs in a composite action, so we cannot run our linters in parallel. We need to use something else from GitHub Actions.

Limitation of composite action: steps cannot be run in parallel

We can create reusable workflows that can have multiple jobs, by using a different feature: on: workflow_call. This feature enables you to invoke a workflow from another workflow.

Reusable workflows for parallel linting of Terraform code

Now, let’s convert our lint composite action into a reusable workflow (lint.yaml):

on:
  workflow_call:
    inputs:
      checkov_version:
        description: 'Checkov version'
        type: string
        default: "2.3.245"
      tfsec_version:
        description: 'tfsec version'
        type: string
        default: "1.28.1"

jobs:
  checkov:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v3
      - name: Test with Checkov
        id: checkov
        uses: bridgecrewio/checkov-action@v12.1346.0
        with:
          framework: terraform
          output_format: sarif
          output_file_path: sarif_dir
          version: ${{ inputs.checkov_version }}
      - name: Upload Checkov SARIF file
        uses: github/codeql-action/upload-sarif@v2
        if: success() || failure()
        with:
          sarif_file: sarif_dir/results_sarif.sarif

  tfsec:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v3
      - name: tfsec
        uses: aquasecurity/tfsec-sarif-action@v0.1.4
        with:
          tfsec_version: ${{ inputs.tfsec_version }}
          sarif_file: tfsec.sarif
      - name: Upload tfsec SARIF file
        uses: github/codeql-action/upload-sarif@v2
        if: success() || failure()
        with:
          sarif_file: tfsec.sarif

The workflow has two jobs: one for Checkov and one for tfsec. Each job runs in parallel, performing the following steps:

  • Checkout the repository
  • Run the scan tool and generate a SARIF file with the results
  • Upload the SARIF file to GitHub Code Scanning

To call a reusable workflow from the main.yaml workflow in the same repository, use the uses keyword pointing to the workflow file:

jobs:
  lint:
    uses: ./.github/workflows/lint.yaml

To reference a reusable workflow from another repository, use:

jobs:
  lint:
    uses: {owner}/{repo}/.github/workflows/lint.yaml@v1
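The calling job can also pass the inputs that the reusable workflow declares under workflow_call. The sketch below additionally grants token permissions for the SARIF upload. The permission set is our assumption about what the upload-sarif action needs and may have to be adjusted for your repository.

```yaml
jobs:
  lint:
    uses: ./.github/workflows/lint.yaml
    # Inputs declared under on.workflow_call.inputs are passed with "with"
    with:
      checkov_version: "2.3.245"
      tfsec_version: "1.28.1"
    # Assumption: uploading SARIF results requires write access to security events
    permissions:
      contents: read
      actions: read            # assumed to matter only for private repositories
      security-events: write
```

A job that calls a reusable workflow has no steps of its own, so settings such as runs-on belong to the jobs inside lint.yaml, not to the caller.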

In the workflow run, we can observe that both jobs run simultaneously.

Reusable workflow for parallel testing of Terraform modules

Once we have implemented parallel linting with different tools, we will move on to the next step, which is creating a simple example of a workflow that can run Terraform module tests in parallel.

To use parallel testing with GitHub Actions, we need to use a special syntax called matrix strategy. This syntax allows us to define multiple values for one or more variables, and then run a job for each combination of those values.
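As a minimal sketch of the syntax, the job below defines two matrix variables. The variable names module and python-version and their values are chosen only for illustration.

```yaml
jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        # 2 modules x 2 Python versions = 4 jobs, one per combination
        module: [alb, s3]
        python-version: ["3.10", "3.11"]
    steps:
      - run: echo "module ${{ matrix.module }} on Python ${{ matrix.python-version }}"
```

Each combination becomes a separate job that reads its own values through the matrix context.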

One way to test modules is tftest, a Python package that provides a testing framework for Terraform modules.

With tftest, users can write different types of tests, such as tests that only run terraform init and plan to check that the code is syntactically correct and that the right number and type of resources would be created, or full-fledged tests that run the full apply / output / destroy cycle and can then be used to test the actual created resources or the state file. Users can leverage pytest fixtures to set up and tear down test environments or run multiple tests with different parameters.

Another popular test framework is Terratest, written in Go.

In this example, we have Terraform files that create an ALB module and an S3 module. The directories modules/alb/tests and modules/s3/tests contain our test cases. The file .github/workflows/main.yaml just references the test.yaml and lint.yaml pipelines defined in another repository containing reusable workflows.

├── .github
│   └── workflows
│       └── main.yaml
├── modules
│   ├── alb
│   │   ├── main.tf
│   │   ├── outputs.tf
│   │   ├── tests
│   │   │   ├── test_alb.py
│   │   │   └── requirements.txt
│   │   ├── variables.tf
│   │   └── versions.tf
│   └── s3
│       ├── main.tf
│       ├── outputs.tf
│       ├── tests
│       │   ├── test_s3.py
│       │   └── requirements.txt
│       ├── variables.tf
│       └── versions.tf
└── README.md

Here is a simple example of main.yaml:

jobs:
  lint:
    uses: <org>/<repo>/.github/workflows/lint.yaml@v1
  run-tests:
    uses: <org>/<repo>/.github/workflows/test.yaml@v1

We just add a new job and link it to the workflow from another repository with the uses keyword. Executing the tests is therefore as simple as referencing test.yaml in a new job.

Workflow repository:

├── .github
│   └── workflows
│       ├── lint.yaml
│       └── test.yaml
└── README.md

Let’s now define a basic example of test.yaml.

Running module tests in parallel with GitHub Actions matrix property

We will take advantage of the matrix feature that we have previously mentioned.

Dynamically building a matrix

We can use jobs.<job_id>.strategy.matrix to define a matrix of different job configurations. The matrix field accepts a literal value or an expression. An expression can provide either a single value (such as a string) or a complex object (using fromJson).
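The two forms can be sketched side by side. In the example below, python-version is a literal YAML list, while dirs comes from an expression that parses a hard-coded JSON string. The directory paths are the ones from this article and the job itself is only an illustration.

```yaml
jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        # Literal value: a plain YAML list
        python-version: ["3.11"]
        # Expression: fromJson turns a JSON string into a list
        dirs: ${{ fromJson('["./modules/alb/tests", "./modules/s3/tests"]') }}
    steps:
      - run: echo "Testing ${{ matrix.dirs }} with Python ${{ matrix.python-version }}"
```

The JSON string is fixed here. Producing it from a script at run time is what requires the extra job described next.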

However, if you want to use the output of a script as the input of a matrix, you need a workaround. In our case, a script just returns a list of the directories containing module tests in JSON format. One possible solution is to generate the JSON in one job (find-test-dirs) and pass it to the second job (test) using outputs:

jobs:
  find-test-dirs:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v3
      - name: Find test directories
        id: set-dirs
        run: |
          # The -name tests are grouped so that -type d and -not -path apply to both
          dirs=$(find ./ -type d \( -name "test" -or -name "tests" \) \
            -not -path '*/.terraform/*' \
            | jq --raw-input --slurp 'split("\n")[:-1]')
          # The unquoted echo collapses the multi-line JSON onto a single line
          echo "dirs=$(echo $dirs)" >> $GITHUB_OUTPUT
    outputs:
      dirs: ${{ steps.set-dirs.outputs.dirs }}

  • run collects the paths of all directories named test or tests, except those located under a .terraform directory, and represents them as JSON.

So the dirs variable in our example will contain the following JSON content:

[
  "./modules/s3/tests",
  "./modules/alb/tests"
]

  • echo "dirs=$(echo $dirs)" >> $GITHUB_OUTPUT: this is how a step sets an output variable in GitHub Actions.
  • outputs is a field that defines the output variables of a job. We can use these variables in other jobs that depend on this job (set via needs).

Next, we are going to add a dependency by using needs: find-test-dirs

  test:
    runs-on: ubuntu-latest
    needs: find-test-dirs
    strategy:
      matrix:
        dirs: ${{ fromJson(needs.find-test-dirs.outputs.dirs) }}
    steps:
      - uses: actions/checkout@v3
      - name: Set up Python 3.11
        uses: actions/setup-python@v4
        with:
          python-version: "3.11"
      - name: Install dependencies
        working-directory: ${{ matrix.dirs }}
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt
      - name: Test with pytest
        working-directory: ${{ matrix.dirs }}
        run: |
          pytest
  • needs: find-test-dirs is a field that tells GitHub actions that this job depends on another job called find-test-dirs. This means that this job will only run after the find-test-dirs job has completed successfully.
  • matrix: is a field that defines the matrix of different configurations for this job. A matrix is a set of variables and their values that can be used in other parts of the job definition.
  • The fromJson in matrix is a function that converts the value from a JSON string to an array of strings. This means that the dirs variable will contain an array of directory paths that were found by the find-test-dirs job.
  • working-directory: ${{ matrix.dirs }} is a field that sets the working directory for the step to the value of the dirs variable from the matrix. This means that each instance of this job will run the pytest command in a different directory from the array.

So, the test job will run multiple instances of pytest in parallel, each one testing a different directory that was found by the previous job.
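A dynamic matrix has a few edge cases that are worth guarding against. The fragment below is a sketch of a hardened variant of the test job, not the configuration used in this article. It assumes that the find-test-dirs job outputs exactly [] when nothing is found, and the max-parallel value is arbitrary. To our knowledge, an empty matrix makes the run fail, which is why the job is skipped in that case.

```yaml
jobs:
  # The find-test-dirs job is assumed to be defined as shown earlier
  test:
    runs-on: ubuntu-latest
    needs: find-test-dirs
    # Skip the job when no test directories were found
    if: needs.find-test-dirs.outputs.dirs != '[]'
    strategy:
      fail-fast: false   # let the other modules finish when one of them fails
      max-parallel: 4    # hypothetical cap on simultaneous jobs
      matrix:
        dirs: ${{ fromJson(needs.find-test-dirs.outputs.dirs) }}
    steps:
      - uses: actions/checkout@v3
      - name: Test with pytest
        working-directory: ${{ matrix.dirs }}
        run: pytest
```

By default fail-fast is enabled, so a failure in one module would cancel the jobs still running for the other modules.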

In our example we have two test directories (s3 and alb), so the workflow runs two parallel test jobs, as expected.

Conclusions

GitHub Actions is a convenient and easy-to-set-up tool for GitHub repositories. What is more, with GitHub-hosted runners it does not require any hosting of your own, which makes it a good fit for open source projects.

By using Composite Actions and Reusable Workflows, we can follow the DRY principle and make our GitHub Actions more maintainable and reuse our existing pipeline code across different repositories.

Thanks
