Automation workflow for image assets using GitHub Actions

Yeray David Rodriguez Domínguez
Published in edataconsulting
8 min read · Sep 22, 2021


Art Matters
Neil Gaiman.

When we think about automation workflows or CI/CD, we usually think about code. But in millions of software applications, the code is not alone: desktop applications have icons, web front ends have image assets, and video games have tons and tons of sprites, backgrounds, and textures.

Despite being, in some cases, a crucial part of the software, image assets are usually neglected and forgotten in DevOps operations. At most, a CI/CD workflow takes the image files from some obscure repository (maybe Dropbox or an NFS share) and simply copies them to the final production folder where they are to be stored. No version control, no optimization, no processing… nothing. And if one of the graphic designers makes a mistake, the wrong logo may be uploaded to the public folder without anyone noticing.

Let’s unleash all the power of ImageMagick and GitHub Actions to build a full-fledged image management workflow that performs all these actions:

  • Check which images have changed in the repository, and for each of them:
      1. Check its metadata to guarantee that it is a valid asset
      2. Process it to a standard format and size, if necessary
  • Generate a contact sheet, with which any mistaken image can be easily spotted
  • Generate an index HTML with additional information for further validation
  • Upload all processed images to the repository
  • Generate a downloadable artifact to easily check the processed files

I’m sure that you have understood it, but let me attach a nice diagram, just to make the story prettier:

Let’s see how we implement it, step by step:

About GitHub Actions

As you probably know, GitHub is an extremely popular code hosting and version control platform. It also provides a fully integrated automation and CI/CD system called GitHub Actions.

Since both the hosting and the automation features are completely free for open source projects, it seems a good choice for our workflow. If you wish to know more about GitHub Actions, this is a good place to start.

Our example workflow

Every mixed team (with developers and artists) will have its own methods and protocols. In this example, we will assume a general workflow that can be applied to almost every kind of project and can be easily modified to fit particular cases.

  • The artists will provide the raw assets in PSD format, as it is a well-defined, lossless format that supports transparency and can store both IPTC and EXIF metadata. It also supports layers, so our code will be prepared to ignore them and use the flattened image of the file.
  • The raw art can be stored in any size, but the workflow will scale it down to a particular predefined size if it is larger.
  • The final format of the asset will be decided by a particular keyword (“sprite”) in the metadata. If the keyword exists, the format will be PNG with an alpha channel. Otherwise, it will be JPEG (lossy).
  • The destination folder of the processed art will be different from the raw art folder, but with the same relative path (that is, a raw asset in the raw/scene01 folder will be processed and stored in the src/scene01 folder).

Our tools

  • We will use the tj-actions/changed-files GitHub Actions plugin to get the list of files that have been modified since the last push or pull request, and the test-room-7/action-update-file plugin to upload the modified files back to the repository.
  • We will use ImageMagick, a popular and powerful image processing software, to perform all the image assets’ operations and contact sheet generation.
  • And of course, we will use the checkout action provided by GitHub to get the repository files, and the upload artifact action to provide the users convenient direct access to the updated assets folder.

Get the list of images to be processed

The very first step in the process is to get the image files that need to be processed. We will check out the repository using the standard GitHub action, and then ask for the changed (modified or new) files using the tj-actions/changed-files action:

In this first part of the workflow, we also define the default branch on which we are going to perform all the operations (main).

We will also define two environment variables that will be useful in further steps: MAX_SIZE, which specifies the maximum size for the processed art assets, and PROCESS_ALL, a boolean flag we can use to force the processing of all image assets. As can be seen, we assume that all the images are stored in the raw folder (probably organized in sub-folders).

Now we have to prepare the actual array of all the files that are to be processed, using the list returned by the tj-actions/changed-files action or, if the PROCESS_ALL flag is set to true, the list of files in the folder returned by the find bash command. This is done at the very beginning of the next step in the workflow:
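As a rough illustration of that choice, here is a minimal sketch. It is not the code of the original workflow: the function name list_files is hypothetical, the list of changed files is passed in as a plain space-separated string (so paths without spaces are assumed), only .psd files are kept, and the result is printed one path per line instead of being stored in an array.

```shell
# list_files PROCESS_ALL RAW_DIR CHANGED
# Prints the files to process, one per line.
# CHANGED stands in for the space-separated list reported by the
# changed-files action (assumption: no spaces inside the paths).
list_files() (
  process_all=$1; raw_dir=$2; changed=$3
  if [ "$process_all" = "true" ]; then
    # Forced run: every PSD below the raw folder, in a stable order.
    find "$raw_dir" -type f -name '*.psd' | sort
  else
    set -f   # no glob expansion while splitting the list into words
    for f in $changed; do
      case "$f" in
        # Keep only PSD files that live inside the raw folder.
        "$raw_dir"/*.psd) printf '%s\n' "$f" ;;
      esac
    done
  fi
)
```

In a workflow step, the output could be consumed with a loop such as list_files "$PROCESS_ALL" raw "$CHANGED" | while IFS= read -r file; do … done, where CHANGED is whatever variable holds the output of the changed-files step.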

Read image information and metadata

And now, let’s begin the actual image work!

We will skip the images stored in a particular subfolder (extra in our example). In this folder, we will store images that won’t actually be used by the application itself, like marketing, documentation, or reference pictures.

For the rest of the images, we will retrieve some physical and metadata information about the image files, and perform some integrity tests:

As you can see, we use the magick identify command to retrieve metadata. It is not easy to know which code corresponds to each particular metadata field, but you can use the following ImageMagick command:

magick identify -verbose imagefile

to get (among MANY other data) all the metadata stored in the file and find out which IPTC or EXIF code is associated with the metadata you are interested in. In this Gist you can see what kind of output is produced by this command.

Remember that for some layered image formats (like .PSD) you have to append the [0] suffix to the image file name (e.g., sample.psd[0]).
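To give an idea of what such a step can look like, here is a small sketch under a few assumptions. The function names are hypothetical, the keywords are assumed to live in the IPTC keywords record (2:25), and several keywords are assumed to be separated by semicolons, commas, or spaces. Check the output of magick identify -verbose on your own files before relying on any of this. Only image_info needs ImageMagick; the other two helpers are plain shell.

```shell
# image_info FILE -> prints "WIDTH HEIGHT KEYWORDS" for the flattened image.
# Needs ImageMagick 7. The [0] suffix selects the flattened composite of a
# layered PSD; %[IPTC:2:25] is assumed to be where the keywords are stored.
image_info() {
  magick identify -format '%w %h %[IPTC:2:25]\n' "${1}[0]"
}

# has_keyword KEYWORDS WORD -> succeeds if WORD is exactly one of the keywords.
has_keyword() (
  set -f          # do not expand glob characters found in the keywords
  IFS=' ;,'       # assumed separators between keywords
  for k in $1; do
    [ "$k" = "$2" ] && exit 0
  done
  exit 1
)

# exceeds_max WIDTH HEIGHT MAX -> succeeds if either side is larger than MAX.
exceeds_max() {
  [ "$1" -gt "$3" ] || [ "$2" -gt "$3" ]
}
```

A step could then read the values with read -r width height keywords, feeding it the output of image_info, and branch on has_keyword "$keywords" sprite.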

Set filename, folder, and destination folder

The following step is to decide where we are going to put our processed file.

We will use the same basename, but the extension will be decided according to the existence of the sprite keyword (the alpha channel compatible PNG format will be used for sprites, lossy JPG for non-sprite files). For the destination folder, we will use the source folder, replacing raw with src/assets.

All of this is done using bash commands:
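The following sketch shows one way to do it with plain parameter expansion. The function name dest_path and its arguments are made up for this example, and the raw and destination roots are passed in as parameters so that they can be adapted to your own folder layout.

```shell
# dest_path SRC IS_SPRITE RAW_ROOT DEST_ROOT
# Prints the destination file: same relative folder and basename as SRC,
# with a .png extension for sprites and .jpg otherwise.
dest_path() (
  src=$1; is_sprite=$2; raw_root=$3; dest_root=$4
  rel=${src#"$raw_root"/}        # raw/scene01/hero.psd -> scene01/hero.psd
  dir=$(dirname "$rel")          # scene01 (or "." at the top level)
  name=$(basename "$rel")        # hero.psd
  base=${name%.*}                # hero
  if [ "$is_sprite" = "true" ]; then ext=png; else ext=jpg; fi
  if [ "$dir" = "." ]; then
    printf '%s/%s.%s\n' "$dest_root" "$base" "$ext"
  else
    printf '%s/%s/%s.%s\n' "$dest_root" "$dir" "$base" "$ext"
  fi
)
```

For instance, dest_path raw/scene01/hero.psd true raw src/assets prints src/assets/scene01/hero.png.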

Actual image processing

Each file is processed in one of the following ways:

  • If the source file is bigger than the desired destination size, it will be resampled using ImageMagick, and the output of this process will be stored with the filename and format decided in the previous step.
  • If there is no need to resize, but the source format is different from the destination format, it must also be processed with ImageMagick to perform the conversion.
  • If the image is equal to or smaller than the desired destination size, and the source and destination formats are the same, we just copy the file.
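These three cases can be sketched as follows. Both function names are hypothetical and the code is only an approximation of the idea, not the original workflow step. The decision itself is plain shell, while process_image assumes that ImageMagick 7 (the magick command) is installed.

```shell
# choose_action WIDTH HEIGHT MAX SRC_EXT DEST_EXT -> resize | convert | copy
choose_action() {
  if [ "$1" -gt "$3" ] || [ "$2" -gt "$3" ]; then
    echo resize      # larger than the limit: resample
  elif [ "$4" != "$5" ]; then
    echo convert     # right size, wrong format: convert only
  else
    echo copy        # nothing to do: plain copy
  fi
}

# process_image SRC DEST ACTION MAX
process_image() {
  mkdir -p "$(dirname "$2")"
  case "$3" in
    # "WxH>" shrinks only images larger than the box and keeps the aspect
    # ratio; [0] picks the flattened image of a layered file.
    resize)  magick "${1}[0]" -resize "${4}x${4}>" "$2" ;;
    convert) magick "${1}[0]" "$2" ;;
    copy)    cp "$1" "$2" ;;
  esac
}
```

The output format is inferred by ImageMagick from the extension of the destination file, so no explicit format option is needed here.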

Cool! We’re done… or are we?

It is important to note that any new file created during a GitHub Actions workflow execution is not automatically added to the repository. It is only created in the virtual machine that runs the commands and will disappear after the workflow is done.

So, we have to make those new processed files persistent. We will do it in two different ways:

  • We will generate and upload an artifact storing all the files in the output folder. This way, on the workflow execution page, anyone can download it and put it in their local copy.
  • We will update the repository with a regular Git commit so that the processing action appears in the version control history.

First bonus: Contact Sheet generation

Up to this point, the workflow works as intended: it takes a bunch of raw images and processes them to be consumed by our application.

But wouldn’t it be nice to have some resource with which the artists can quickly browse all the generated art? It could also be used to detect inconsistencies or duplications, without downloading the repository or painfully checking all the filenames.

A Contact Sheet is a perfect tool for it. It comes from the analog era of photography and consists of a collection of thumbnails for fast browsing.

A nice analog example of a Contact Sheet by Ggia. Creative Commons License 3.0 Share-Alike

ImageMagick makes it very easy to generate the Contact Sheet. We just put all the files in the assets tree into a temporary folder, and we execute the montage ImageMagick tool. Please refer to its documentation to know more about each one of the parameters.
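As an illustration, the two steps could look like the sketch below. The function names, the thumbnail geometry, and the number of columns are arbitrary choices for this example, and folding the sub-folder into the file name is just one assumed way of avoiding clashes between files with the same name. Only make_contact_sheet needs ImageMagick 7.

```shell
# gather_assets ASSETS_DIR TMP_DIR
# Copies every PNG/JPG into one flat folder. The sub-folder becomes part of
# the file name (scene01/hero.png -> scene01_hero.png) to avoid clashes.
gather_assets() (
  assets=$1; tmp=$2
  mkdir -p "$tmp"
  find "$assets" -type f \( -name '*.png' -o -name '*.jpg' \) | sort |
  while IFS= read -r f; do
    rel=${f#"$assets"/}
    cp "$f" "$tmp/$(printf '%s' "$rel" | tr '/' '_')"
  done
)

# make_contact_sheet TMP_DIR OUTPUT
make_contact_sheet() {
  # -label '%f' captions each thumbnail with its file name,
  # -geometry sets the thumbnail size and spacing,
  # -tile 6x lays the thumbnails out in six columns.
  magick montage -label '%f' "$1"/* -geometry '160x160+6+6' -tile 6x "$2"
}
```

With hundreds of images, montage may split the result into several numbered files, so it is worth checking the output folder after the first run.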

Second bonus: HTML index generation

But what if a project uses hundreds or thousands of image assets? The Contact Sheet will have a very impractical aspect ratio, and without text search, browsing through it would be very hard.

That’s when the HTML index comes to the rescue. It is a regular HTML file with references to the image assets, as in a classical web image gallery. The CSS is embedded to increase portability, and the metadata is also inserted for easy search.

We must not forget that nothing created by the workflow is persisted in the repository automatically. We must commit the contact sheet and the HTML index:

How can we browse this HTML? If your local copy of the repository is up to date, you can double-click on the file and it will open in your default browser.
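A stripped-down version of such a generator could look like this. It is only a sketch: the function name make_index and the URL_PREFIX argument (the relative path from the index file to the assets folder) are assumptions made for this example, file names are assumed not to need HTML escaping, and only the file path is written as searchable text. The metadata of each image could be added to the caption in the same way.

```shell
# make_index ASSETS_DIR URL_PREFIX > art-index.html
# Writes a self-contained HTML gallery (embedded CSS) to standard output.
make_index() (
  assets=$1; prefix=$2
  cat <<'EOF'
<!DOCTYPE html>
<html><head><meta charset="utf-8"><title>Art index</title>
<style>
  figure { display: inline-block; width: 180px; margin: 8px; font: 12px sans-serif; }
  img { max-width: 100%; }
</style></head><body>
EOF
  find "$assets" -type f \( -name '*.png' -o -name '*.jpg' \) | sort |
  while IFS= read -r f; do
    rel=${f#"$assets"/}
    # The caption makes each entry findable with the browser's text search.
    printf '<figure><img src="%s/%s" alt="%s"><figcaption>%s</figcaption></figure>\n' \
      "$prefix" "$rel" "$rel" "$rel"
  done
  printf '</body></html>\n'
)
```

For example, make_index src/assets ../../src/assets > raw/extra/art-index.html would produce an index stored two levels below the repository root that points back to the processed assets.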

You can also preview it directly on the remote repository using this trick:

  • On the GitHub page of the repository, open the HTML file. The URL will be something like

https://github.com/my-org/my-repo/blob/main/raw/extra/art-index.html

  • Prepend https://htmlpreview.github.io/? to that URL, so that it looks like

https://htmlpreview.github.io/?https://github.com/…/art-index.html

When you load that URL, the Art Index will be loaded in your browser.

Packing it all together

The final YAML, which is too large to be embedded here, can be found here. Feel free to use it and modify it as needed.

This was used in a real-life project: Iakkai Saga — The Curse of Blood, a fantasy-packed Interactive Fiction for iOS devices. You can read more about this project in its repository.

Some final words

In the end, a software product is as good as its user perceives it, and a brilliant piece of engineering packed with features can be labeled as poor or flawed if its packaging (the UI and all graphic resources) has not been treated with care. This becomes crucial with products like video games or teaching materials, in which the images are intrinsic to the functionality.

Humans are fallible; an automation workflow is not (at most, its design is faulty, due to humans). Any operation that is delegated to it is a risk removed from the system, and we have plenty of tools to do so for the most typical (and boring) image processing tasks.

Because, don’t forget it, art matters.
