Automating content creation for docs
By Adriaan van der Feltz, Automation Specialist, Adyen
Adyen has a team of over 140 developers, with several code commits being made each minute, across more than seven languages, and 25+ products. Creating all our documentation manually put a lot of strain on our internal resources, so recently, we looked at ways to automate and scale content creation.
The great thing about documenting source code is that all information is already present in the code. The problem was how to unlock it. Our approach was to use existing tools, which individually are widely used and accepted, and put them together in a modular way to scale content creation.
1. Choosing the right tools
While JavaDoc-style code references are common documentation sources for developers, they are far from intuitive. But our tables and code samples needed to be presented in a clean and easy-to-read manner, and consistent with the look and feel of our documentation. Furthermore, code references needed to be embedded in the docs that describe how to use them, and ideally accompanied by samples/snippets that implement the referenced code.
3. How to update relevant changes to the product in docs
With such a large body of code, finding the relevant changes that affect developers integrating with Adyen was a significant challenge. Automating updates was the aim, but in such a way that back end or trivial commits don’t need to trigger documentation updates.
4. Quality and security
Generating content automatically removes manual work, but quality checks still need to be in place. For example, do changes in code snippets need to be reflected elsewhere in the text? And more critically — are there any security risks? We had to find a balance between automation, quality, and security.
As our challenges were quite complex, we split the process into three phases.
Phase 1: Use Doxygen to push code to the CMS
Doxygen is a tool that takes annotated/commented source files from code repositories, and generates code references in HTML or an offline reference manual. In the first phase, we set up Doxygen to generate an HTML specification and push the output to our CMS through its API.
We also needed a way to compare the changes between versions on a ‘functional’ level — highlighting what developers integrating with Adyen need to know and disregarding what is not important. For this we did a simple diff process to compare and categorize what parts of the specification where newly added, deleted, edited or remained the same between versions. Next, the corresponding, human-created documentation needed to be updated. Following this, our HTML specification pages were parsed by our program and converted into objects, ready to push to our CMS via its API.
Phase 2: Trigger documentation automatically
In Phase 2, we wanted to automate this process. We introduced Jenkins, a continuous integration server, into the stack. Running our project via Jenkins provided an automated start for documentation output. Jenkins ensures that code could be automatically pulled from our code repository whenever one of our developers makes a commit.
Whenever Jenkins detects a commit to the code base, it checks out a new copy of the latest version for the specific documentation project. Next it retrieves archived specifications corresponding to the penultimate version. Finally, it triggers the process set up in Phase 1. Ultimately, we set up a dedicated Jenkins job for each product project that requires documentation.
Phase 3: Create formatted documentation
This was a great start in terms of automating the process, but it was no use if we couldn’t ensure that our code snippets and tables weren’t presented clearly. Doxygen is open source, so we were able to build our own custom version in which we could change the template files, according to which the HTML specification is generated. Also, we automatically included macros to further improve the look and feel of our docs including such things as page properties, code blocks, and so on.
Furthermore, much of our documentation is use-case based. We created use-case based output by generating a subset of our full docs filtered by the JSON message defining the use case. This process is now very fast and easy — we simply upload the JSON file of the use case to the code repository, which automatically triggers the whole process to generate a new page on our docs website, with only the information relevant to that specific case.
200 pages worth of documentation, which would have previously taken weeks of work to create manually, were generated in 15 seconds flat.
The content automation initiative has drastically reduced the amount of manual work required by the documentation team. In one particular example, 200 pages worth of documentation, which would have previously taken weeks of work to create manually, were generated in 15 seconds flat. Even better, this was created without any chance of human errors, and in a manner that can be updated easily. This approach is helping us to scale docs and keep our team resources available for bigger projects, while enabling developers integrating with Adyen to be confident in the knowledge that all our code samples and tables are up-to-date.
Working at Adyen
I joined Adyen as part of the Mechanical Master Program in January 2017 (read about my colleague Rick’s experience here: Internship experience: Improving point-of-sale with passive learning). My project was to automate the translation process across our point-of-sale terminals in 26 languages, which reduced a time-consuming manual review process that involved up to 60 people, to a quick and easy process with no manual review. Following the program, I joined the documentation team as an automation specialist, to look at ways to take the manual work out of content creation.
We are always on the lookout for the next generation of engineers and technical people to help us build the infrastructure of global commerce. If you are interested in finding out more, check out our vacancies: https://www.adyen.com/careers/vacancies/development