Dokka behind the scenes — documenting multi-module projects

Published in VirtusLab · 6 min read · May 11, 2021

Documenting a multi-module project has always been tricky due to the size, complexity and interconnections within such codebases. This is especially true for Dokka, the documentation engine for Kotlin, which plays the same role as Javadoc does for Java. In 1.4.30 we rewrote multi-module documentation, so let’s dig a little deeper into the design and the thought process behind it!

What is multi-module documentation?

Consider a simple Gradle build with two subprojects:
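As a minimal sketch, such a build could be declared in settings.gradle.kts roughly as follows. The root project name is a placeholder, and the module names are inferred from the dependency declaration used in this article.

```kotlin
// settings.gradle.kts (sketch; the root project name is a hypothetical placeholder)
rootProject.name = "multi-module-example"

// Two subprojects, named to match the dependency declaration in this article
include(":moduleA", ":moduleB")
```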

Let’s then say that module B has a dependency on module A:

dependencies {
    implementation(project(":moduleA"))
}

For us, this is a case of multi-module documentation: we have several separate but interconnected modules, and we’d like to document them as a whole project.

How multi-module was treated previously

Originally our idea was to have a single (root) multi-module task that would invoke independent child tasks from the submodules and collect their results to generate a root entry page. Each module would need to have a Dokka Gradle plugin enabled and a markdown file that would contain module and package documentation. This documentation file would then be visible on the module page and on the all-modules page. This approach had a lot of advantages when it comes to concurrency, cache management and simplicity. All of those aspects would be delegated to Gradle and would allow for easy customization in the future since those were standard tasks.
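In Gradle Kotlin DSL terms, a setup along these lines could look like the sketch below. It assumes a Dokka 1.4.x release (1.4.30 is used here), omits repository and Kotlin plugin configuration, and treats Module.md as an assumed name for a documentation file that exists in each module.

```kotlin
// Root build.gradle.kts (sketch, assuming Dokka 1.4.30)
plugins {
    id("org.jetbrains.dokka") version "1.4.30"
}

subprojects {
    // Every module that should be documented needs the Dokka plugin
    apply(plugin = "org.jetbrains.dokka")

    tasks.withType<org.jetbrains.dokka.gradle.DokkaTaskPartial>().configureEach {
        dokkaSourceSets.configureEach {
            // Module.md is an assumed file name for module and package documentation
            includes.from("Module.md")
        }
    }
}
```

With this in place, running the root dokkaHtmlMultiModule task should trigger the partial tasks in the subprojects and then build the all-modules page.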

What were the issues with it?

After the 1.4.0 release, it became clear that this solution had some shortcomings:

there is no way to link between modules other than creating an external documentation link;
it is hard to have a common search bar;
creating a link to a multi-module page requires some assumptions to be made.

Most of those issues were related to the fact that only the root multi-module task was aware of the existence of child modules during generation. From the perspective of a child task, everything looked just like a single-module build. As a result, there was no way to reliably link between two modules, since a child task did not even know that the other modules existed. In most cases, however, it is desirable to have unified documentation for all the parts, with all modules listed on the top-level page and a common navigation.
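For illustration, the external documentation link workaround could be configured roughly like this in module B. Both URLs are hypothetical and assume that module A's documentation has already been published somewhere reachable.

```kotlin
// moduleB/build.gradle.kts (sketch): a manual link to moduleA's published docs
import java.net.URL
import org.jetbrains.dokka.gradle.DokkaTaskPartial

tasks.withType<DokkaTaskPartial>().configureEach {
    dokkaSourceSets.configureEach {
        externalDocumentationLink {
            // Hypothetical URL where moduleA's documentation is hosted
            url.set(URL("https://example.com/docs/moduleA/"))
            // Hypothetical location of the matching package-list file
            packageListUrl.set(URL("https://example.com/docs/moduleA/package-list"))
        }
    }
}
```

The drawback is visible right away: every module has to hard-code where its siblings end up, which is exactly the knowledge a child task did not have.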

Creating a common search bar is slightly easier conceptually: after the child tasks have finished, gather their results, merge them, and substitute the merged result into every module separately. Unfortunately, when you actually implement this, you quickly realize that the Gradle scripts become convoluted with parsing, merging and saving logic. We weren’t keen on keeping this solution, so the PR was rejected, as it would quickly have become unmaintainable.
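The merge step itself is simple, as the sketch below suggests. The SearchRecord type and the path layout are hypothetical and do not reflect Dokka's actual search index format.

```kotlin
// Hypothetical search record; not Dokka's actual search index format.
data class SearchRecord(val name: String, val location: String)

// Merge per-module records into one index, prefixing each location with
// its module directory so that links work from the documentation root.
fun mergeSearchIndexes(perModule: Map<String, List<SearchRecord>>): List<SearchRecord> =
    perModule.flatMap { (module, records) ->
        records.map { it.copy(location = "$module/${it.location}") }
    }.sortedBy { it.name }

fun main() {
    val merged = mergeSearchIndexes(
        mapOf(
            "moduleA" to listOf(SearchRecord("ClassA", "a/-class-a/index.html")),
            "moduleB" to listOf(SearchRecord("ClassB", "b/-class-b/index.html"))
        )
    )
    merged.forEach { println("${it.name} -> ${it.location}") }
}
```

The trouble starts around this function: reading each module's output, parsing it, and writing the merged result back into every module is what bloated the Gradle scripts.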

Finally, we wanted the Dokka logo to redirect users to the main page: the all-modules page in a multi-module build and the module page in a single-module one. This is quite easy for a single module, since we know where the module page is (we generate it in the same task). For multi-module it is not so trivial, because during generation a child task doesn’t know whether it is running in a multi-module context or not. Of course, we could have introduced a hacky solution by adding a flag like isMultiModule to the configuration, but this would only help in the short term. We needed a better way of doing it.

Default all-modules page in the initial solution

Package page in the initial solution. Notice the absence of a link to ClassA (it can only be added with an external documentation link) and that the left-side navigation shows the contents of only one module.

What did we want the solution to look like?

Mainly, we wanted to preserve the gist of the older approach: create a root task and child tasks. This way, we delegate a lot of responsibility to Gradle itself, and we are not forced to reinvent the wheel.

Having this in mind, we started searching for solutions. After a long discussion, two proposals were selected:

generate a JSON file with Dokka’s Pages model in every child task, process it in the root multi-module task and render;
render as much as we possibly can without the knowledge of other modules to create a template and process it in the root multi-module task.

For those of you who are not familiar with Dokka’s architecture: Dokka uses several intermediate representations of the code to allow for some abstraction. Pages describe what the rendered page should look like. For example, while creating a Page you can decide whether the description should be above or below the signature.

On paper, the first approach looked way better since it allows plugin creators to have a single point where they can view a full model of the pages. Unfortunately, it also has some drawbacks:

some parts of the model are hard to serialize, e.g. RenderingStrategies. For the sake of convenience, some of them require a callback to resolve the location of a certain entry. This keeps the coupling fairly low, but a callback isn’t straightforward to serialize;
it requires a lot of memory. Right now, Dokka is probably not the most memory-efficient application, but the memory is reclaimed after every child task finishes, so this is not a major problem. With this approach, however, the root task would have to hold the full Pages model of every module in memory during rendering, which is almost impossible since it would need enormous amounts of it.

The second approach appeared to be way more efficient because it requires hardly any memory to work and should be quite fast. While implementing it, we decided that each gap in the template requires some context, so every gap contains a serialized Command that describes how it should be processed. For example, the ResolveLinkCommand carries a unique DRI (Dokka Resource Identifier) that is used to link between entries, so the link can be resolved properly.
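The idea can be sketched in a few lines of Kotlin. All names and the DRI strings below are hypothetical simplifications; they mirror the concept only and are not Dokka's real classes or template format.

```kotlin
// Hypothetical model: a template is a sequence of rendered text and gaps.
sealed class Segment
data class Text(val html: String) : Segment()
// A gap: the child task knows the DRI but not where the target will live.
data class ResolveLink(val dri: String, val label: String) : Segment()

// A child task renders what it can and leaves a gap for the unknown target.
val template: List<Segment> = listOf(
    Text("<p>See "),
    ResolveLink(dri = "moduleA/ClassA", label = "ClassA"),
    Text("</p>")
)

// The root task knows every module's locations, so it can fill the gaps.
fun process(template: List<Segment>, locations: Map<String, String>): String =
    template.joinToString("") { segment ->
        when (segment) {
            is Text -> segment.html
            is ResolveLink -> locations[segment.dri]
                ?.let { "<a href=\"$it\">${segment.label}</a>" }
                ?: segment.label // unresolved: fall back to plain text
        }
    }

fun main() {
    val locations = mapOf("moduleA/ClassA" to "../moduleA/-class-a/index.html")
    println(process(template, locations))
}
```

Because each gap carries everything needed to fill it, the root task never has to load a module's full Pages model.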

How it ended up

We realised that a huge advantage of this solution is that most popular commands like ResolveLinkCommand can be substituted in place so that no state is required. That greatly increases the efficiency of this solution and decreases complexity. Of course, it is not all roses and unicorns and some commands like AddToNavigation have to be collected and written to the file at the end.
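The difference between the two kinds of commands can be sketched as follows. The types are hypothetical stand-ins that borrow the command names from the text; the real processing in Dokka may be organised differently.

```kotlin
// Hypothetical command types; they only illustrate the two processing styles.
sealed class Command
data class ResolveLink(val dri: String) : Command()
data class AddToNavigation(val module: String, val entry: String) : Command()

class TemplateProcessor(private val locations: Map<String, String>) {
    // State is needed only for commands that contribute to a shared file.
    private val navigation = mutableListOf<String>()

    // Returns the text to substitute in place, or "" when the command only
    // contributes to output that is written later.
    fun handle(command: Command): String = when (command) {
        is ResolveLink -> locations[command.dri] ?: "#"
        is AddToNavigation -> {
            navigation += "${command.module}/${command.entry}"
            ""
        }
    }

    // Called once, after all templates have been processed.
    fun finish(): String = navigation.sorted().joinToString("\n")
}

fun main() {
    val processor = TemplateProcessor(mapOf("moduleA/ClassA" to "moduleA/-class-a/index.html"))
    println(processor.handle(ResolveLink("moduleA/ClassA")))
    processor.handle(AddToNavigation("moduleB", "ClassB"))
    processor.handle(AddToNavigation("moduleA", "ClassA"))
    println(processor.finish())
}
```

Link resolution needs nothing beyond a lookup table, while navigation entries have to be accumulated across all templates before the shared file can be written.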

Another aspect of the solution is pluggability. After The Great Rewrite (that is, since version 1.4.0), Dokka has embraced the idea of creating plugins for your documentation. This extends even to multi-module: the functionality is itself a plugin, which keeps the core of the application as thin as possible. It allows plugin creators to easily customize how their multi-module page or navigation looks. What is great is that plugins can be shared between users just like regular libraries or Gradle plugins, but they can also be written and used locally without publishing, for example in a separate Gradle subproject.
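Both ways of consuming a plugin might look like this in a Gradle build script. This is a sketch that assumes the Dokka Gradle plugin is applied through the plugins block; the subproject name ":docs-plugin" is hypothetical.

```kotlin
// build.gradle.kts (sketch)
dependencies {
    // A published plugin, consumed like a regular library
    dokkaPlugin("org.jetbrains.dokka:kotlin-as-java-plugin:1.4.30")
    // A local plugin kept in a subproject; ":docs-plugin" is a hypothetical name
    dokkaPlugin(project(":docs-plugin"))
}
```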

Default all-modules page in the current solution.

Package page in the current solution; the link to ClassA is present and configured automatically. Notice the full navigation on the left side, as opposed to only the top-level documentation in the previous solution.

Overall, we are quite happy with the new multi-module approach and are eager to see whether it will stand the test of time or not.
Some early adopters have already migrated to the new multi-module experience; here are a few examples:

If you would like to be one of them, here is a basic guide on how to add Dokka to your project.