DocOps: Continuous Integration / Consistent Documentation

Dominik Tujmer
ReversingLabs Engineering
Jun 12, 2023
(Cover image source: https://pixabay.com/illustrations/pipeline-water-flange-sample-1566044/)

In a recent post, my colleague Kristijan discussed how we developed a way to automate publishing to Zendesk using their API and some clever Python scripting. In this post, I’ll go over some additional things that we’ve been doing and how we’ve fully transitioned into what some call DocOps. Expanding on the idea of docs as code, DocOps is a paradigm in which documentation gets the same CI/CD treatment as your product: automated tests, scheduled deployments, quick and easy fixes, and more.

Basics

All our documentation is stored as text files on our GitLab instance; specifically, it’s reStructuredText. We chose rST mostly because you can easily .. include:: content from other files, and because it offers a lot of configurability and extensibility that Markdown doesn’t.

Sphinx transforms these text files into HTML, and that HTML then goes through an additional transformation into PDF with WeasyPrint.
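
Conceptually, the chain looks something like this (the paths here are illustrative, not our actual layout):

sphinx-build -b html docs/source docs/build/html
weasyprint docs/build/html/index.html docs/build/pdf/user-guide.pdf

In practice these steps are wrapped in make targets, as you’ll see below.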

As software gets released, the documentation follows. Different products have different cycles and requirements, so it’s important to have a system where 2–3 people can maintain several thousand pages of complex technical content. Building documentation locally and running upload scripts worked well, but our newest addition is harnessing the power of GitLab CI/CD.

Continuous integration, continuous documentation

Let’s say you have a Python script that uploads your content to a predefined section using the Zendesk API. Locally, you would run it like this:

python update-zendesk.py

But a lot of complexity is hidden in this simple command. Here are some things you need to ensure:

  • Authentication. You will need to have a token. You probably don’t want to hard-code all your authentication info, so you’ll extract it into a token file. Depending on how the script works, you might need to pass this token file as a command line argument, or the script might look in a predefined place. In any case, that’s an additional layer of complexity that you somehow need to handle — locally.
  • Reproducibility. You want to be able to run the same script on different computers and get the same result. In practice though, this is harder. Sure, you could use a Python virtual environment (venv) — and that’s a good idea — but then that’s a policy you need to enforce. You also need to make sure that everyone uses the same venv, which means additional onboarding and additional internal documentation.
  • Availability and ease of use. It needs to be as simple as possible to build and publish documentation. If you get a brand new computer, you should have to install as few dependencies as possible (and ideally you should install nothing!)

So why not wrap all that complexity into a CI/CD workflow? In the rest of this blog, I’ll give an example of how we did this for our malware analysis product: A1000.
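
To give a taste of where we’re headed, here’s a minimal sketch of such a wrapped job. The job layout and the ZENDESK_API_TOKEN variable name are illustrative rather than our exact configuration; the point is that the credentials live in a masked CI/CD variable on the project, not in a token file on someone’s laptop:

publish_zendesk:
  stage: deploy
  script:
    # ZENDESK_API_TOKEN is a masked CI/CD variable defined in the project
    # settings (hypothetical name); the script can read it from the
    # environment instead of a local token file
    - python update-zendesk.py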

YAML-ing for fun and profit

Let’s set out our requirements:

  1. Keep plain text files as source in our git repository.
  2. Build HTML & PDF.

Then publish:

  1. on Zendesk
  2. on an internal documentation portal
  3. on an internal file share (PDFs)
  4. on an external file share (PDFs)

First, in our repo, we will define a .gitlab-ci.yml file and set up some stages:

image: python:latest

stages:
- build
- deploy

And then we’ll create some jobs, depending on what we need to do. We need to:

  • build a PDF document
  • build PDF release notes
  • publish on an internal documentation portal (we’ll use GitLab Pages for that)
  • publish HTML docs on Zendesk
  • publish HTML release notes on Zendesk (in a different section)
  • publish all the PDFs: this is the most prolific part and will have four jobs (two file shares × two document types)

Let’s start with something easy: building the PDF documentation and release notes. We’ll reuse the make commands we’ve inherited from previous workflows and simply fold whatever already works into this new setup.

build_pdf:
  stage: build
  script:
    - make docs_a1000
    - FILE_NAME="ReversingLabs A1000 ${DOCVERSION} - Product Documentation.pdf"
    - mv output/tcbase-a1000/docs/pdf/A1000-User-Guide.pdf "$FILE_NAME"
    - echo "FILE_NAME=$FILE_NAME" >> build_pdf.env
  artifacts:
    paths:
      - "ReversingLabs A1000 * - Product Documentation.pdf"
    reports:
      dotenv: build_pdf.env

After creating the PDF and renaming it according to the product version (which changes with every release), we’ll simply move it into a specific folder (in our case, the root folder), mark it as an artifact, and thus make it available to later jobs (those that handle publishing). This is a good setup because you don’t have to build twice or thrice — you build it once (or, once per commit), and then it stays available for whatever it is that you plan to do later.

Finally, the reports: this is so that later jobs also have access to the exact $FILE_NAME that we defined in this job. In GitLab, variables defined within jobs aren’t available to other jobs unless specifically passed down.
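
Concretely, a later job picks up the variable by declaring a dependency on build_pdf; a minimal sketch (the job name is illustrative):

publish_pdf_example:
  stage: deploy
  needs:
    - job: build_pdf
      artifacts: true
  script:
    # FILE_NAME is injected from the dotenv report produced by build_pdf
    - echo "Publishing $FILE_NAME"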

We’ll do the same thing for our release notes. We’ll name that job something like build_release_notes and follow the appropriate process for it.
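
As a sketch, and assuming a make target and output path analogous to the ones above (the exact names below are hypothetical), it looks very similar:

build_release_notes:
  stage: build
  script:
    # hypothetical make target and output path for the release notes build
    - make release_notes_a1000
    - RN_FILE_NAME="ReversingLabs A1000 ${DOCVERSION} - Release Notes.pdf"
    - mv output/tcbase-a1000/docs/pdf/A1000-Release-Notes.pdf "$RN_FILE_NAME"
    - echo "RN_FILE_NAME=$RN_FILE_NAME" >> build_release_notes.env
  artifacts:
    paths:
      - "ReversingLabs A1000 * - Release Notes.pdf"
    reports:
      dotenv: build_release_notes.env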

Using GitLab Pages

We also need to publish the docs on an internal doc portal. For that, GitLab Pages is a great tool. We’ll create this job in the deploy stage, and we’ll just run our standard sphinx command.

pages:
  stage: deploy
  script:
    - sphinx-build -t full -t html-build sphinx/a1000 public/
  rules:
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
  artifacts:
    paths:
      - public

It’s as easy as that! The only unexpected part is GitLab’s behavior around when the pipeline (or job) runs. By default, this should happen only on commits to the master branch. However, there are also merge request pipelines, which run jobs when a merge request is opened or updated. According to GitLab’s documentation, these pipelines do not run by default, but we found this not to be the case when some of our documentation for Pages got overwritten. You guessed it: it was overwritten by changes from an open MR. That is why we explicitly specify rules: run this job only if the commit is on the master branch. Someone else has run into this same problem and the bug has been reported, but in the meantime, don’t trust GitLab’s docs on this.
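
An alternative, which isn’t what this post describes but is standard GitLab syntax, is to shut out merge request pipelines for the whole project with a workflow block:

workflow:
  rules:
    # never run pipelines for merge request events
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
      when: never
    # otherwise, run only for commits on the default branch
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH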

More publishing

Our internal file share is a Google Drive. We wrote a small Python script that takes in an argument — the PDF file from the build step — and simply updates a file with that same name (or creates a new one, if there isn’t one already). I won’t include the entire script, but here’s the relevant logic:

if FILE_NAME in file_names_id:
    # a file with this name already exists on the Drive, so update it in place
    FILE_ID = file_names_id[FILE_NAME]
    service.files().update(fileId=FILE_ID, media_body=media).execute()
    print(f'Updated existing file: {FILE_NAME}')
else:
    # no file with this name yet, so create it in the target folder
    file_metadata = {'name': FILE_NAME, 'parents': [FOLDER_ID]}
    service.files().create(body=file_metadata, media_body=media).execute()
    print(f'Uploaded new file: {FILE_NAME}')

We simply run this script for every relevant repository. Instead of keeping duplicate copies, we have one “tools” repo, which we temporarily clone into our “product” repositories when their pipelines run.

- git clone https://gitlab-ci-token:${CI_JOB_TOKEN}@example.com/tchw/tools
- cp tools/gdrive_upload.py .
- python gdrive_upload.py
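
Put together, the Google Drive job ends up looking roughly like this (the job name is illustrative, and the clone URL is the same placeholder as above):

publish_gdrive_docs:
  stage: deploy
  needs:
    - job: build_pdf
      artifacts: true
  script:
    # fetch the shared upload script from the central tools repo
    - git clone https://gitlab-ci-token:${CI_JOB_TOKEN}@example.com/tchw/tools
    - cp tools/gdrive_upload.py .
    # $FILE_NAME and the PDF artifact itself come from build_pdf
    - python gdrive_upload.py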

The same logic applies to our external file share (Nextcloud) and our customer support portal (Zendesk): custom Python scripts, one of which was already described in a previous post. The essential upgrade here is unifying all of this scripting and having it available straight from our source control.

Overview of GitLab jobs

A final word

We still haven’t added documentation tests, for example:

  • spell checking
  • link checking
  • build size comparison
  • general linting
  • code sample automated testing

All of the above are currently in the works and we plan to slowly roll them out and include them in a separate test stage.
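
As an example of what such a job might look like (none of these are in our pipeline yet, and this assumes a test stage has been added to the stages list), link checking can reuse Sphinx’s built-in linkcheck builder:

linkcheck:
  stage: test
  script:
    # the linkcheck builder visits every external link and reports broken ones
    - sphinx-build -b linkcheck sphinx/a1000 linkcheck/
  allow_failure: true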

The approach of using CI/CD workflows has the great advantage of being highly scalable: even if the scope of documentation grows by an order of magnitude, these systems allow a small team to handle it all. A GitLab Runner doesn’t really care whether it’s building one HTML document or a hundred, and the underlying server can easily handle this kind of intensive use. That is, what would be a difficult task for a human can be automated away and scaled… well, maybe not indefinitely, but certainly by a lot. This way, the total productivity per tech writer grows significantly, but not only that: tech writers can focus on actually writing and otherwise improving documentation, creating new helpful content, and pruning old and out-of-date information, instead of perpetually battling build issues or taking time out of the day to publish things.

Overall, using CI/CD this way illustrates how quantity has a quality of its own. In other words, when you have a relatively small corpus of text to maintain, it helps, but isn’t a total game changer. But once you veer into the territory of lots and lots and lots of documentation, and also many different places where this documentation needs to end up — then using such an approach really shines.
