Efficient utilisation of the DITA-OT plugin for AEM Guides Migrations

Ajay Shankar
Published in Activate AEM
6 min read · Aug 22, 2023

In this blog post, we’ll see how effectively the DITA-OT Plugin can be used to migrate different formats to the AEM Guides (formerly known as Adobe XML Documentation) supported format.

Specifically, it covers the plugin used for converting XHTML to DITA, along with the DITA-OT configuration instructions.

XSLT mapping rules that allow the conversion to be customised are also covered, with examples and best practices to follow.

What is DITA Open Toolkit (DITA-OT)?

The DITA Open Toolkit (DITA-OT) is a set of Java-based, open-source tools that publishes DITA content in deliverable formats such as PDF, HTML5, Eclipse Help, XHTML and HTML Help.

Configuring DITA-OT

Prerequisites

To work with DITA-OT, a Java Runtime Environment (JRE) or Java Development Kit (JDK) must be installed. DITA-OT can run on Java version 8u101 or later and is compatible with Oracle JRE/JDK, Eclipse Temurin and a few other OpenJDK distributions listed in the official DITA-OT documentation.

Usage

Download the latest DITA-OT zip package from the official site (https://www.dita-ot.org/download) and extract the contents of the package.

Check your DITA-OT version using the command below:

dita --version

Some default plugins are bundled with the DITA-OT package (under the /plugins folder). A few of them are listed below:

org.dita.html5

Plugin that converts DITA to HTML5, generating HTML5 output and a table of contents (TOC) file as part of the transformation.

dita --input=input-file --format=html5

org.dita.pdf2

Plugin that converts DITA to PDF (Portable Document Format) output.

dita --input=input-file --format=pdf

org.dita.xhtml

Plugin that converts DITA to XHTML, generating XHTML output and a table of contents (TOC) file as part of the transformation.

NOTE: XHTML is a stricter and more standardised version of HTML.

dita --input=input-file --format=xhtml

These plugins can be used directly with the following command:

dita --input=input-file --format=format [options]

Here,

input-file : DITA map or topic file that is to be processed

format : Required output format (e.g. html5, pdf, xhtml)

[options] : Build parameters (e.g. --debug or -d, --output=dir or -o dir, --filter=files)

See more: https://www.dita-ot.org/dev/topics/build-using-dita-command.html

How to install External Plugins as part of DITA-OT?

DITA-OT supports additional plugins (for example, com.elovirta.ooxml, which converts DITA to MS Word format). Plugins that are not bundled with the DITA-OT package can be installed using the following command:

dita install <plug-in>

<plug-in> : Any plugin from the official DITA-OT plugin registry (https://www.dita-ot.org/plugins)

Everything that DITA-OT provides, including the additional plugins, is used to convert a DITA file or map into other formats. Some use cases require the reverse of this process (e.g. XHTML to DITA). To achieve this, we need external plugins from the official DITA-OT repository (https://github.com/dita-ot/ext-plugins).

As an initial step, clone this repository to a local folder. From there, any required plugin can be accessed.

Every plugin contains a build.xml file, which is the most significant file because it drives the conversion, and an XSL file that is used for matching patterns during the conversion. The XSL can be altered as per your requirements, and custom rules can also be included.

H2D Plugin

The H2D plugin converts XHTML files into DITA files. This is required in use cases such as migrating HTML content from a site to AEM by leveraging the XML editor within AEM.

The generated DITA files can then be viewed in the different editor modes: Author, Source and Preview.

Configuring H2D

Prerequisites

Apache Ant must be installed to run the Ant script that converts XHTML to DITA via the H2D plugin.

Usage

Open a terminal in the H2D plugin’s root directory and run the Ant script as follows:

ant -Dargs.input={file|directory} -Dargs.output={directory} -Dargs.infotype={topic|concept|task|reference}

Here,

args.input : Input XHTML file or directory

args.output : Output directory for storing the generated DITA files

args.infotype : Type of DITA topic to generate (topic, concept, task or reference)
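If the same conversion is run repeatedly, the command-line arguments can be captured in a small wrapper build file. The following is only a sketch under stated assumptions: the project name, the h2d.dir location and the xhtml-src and dita-out folders are hypothetical, and it assumes the plugin's build.xml runs the conversion from its default target, as the command above implies. Only the three args.* property names come from that command.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical wrapper build file; adjust every path to your own layout. -->
<project name="h2d-wrapper" default="convert" basedir=".">

  <!-- Assumed location of the cloned H2D plugin folder. -->
  <property name="h2d.dir" location="ext-plugins/h2d"/>

  <target name="convert">
    <!-- Calls the plugin's build.xml (default target) with the same
         three arguments that would otherwise be passed with -D. -->
    <ant antfile="build.xml" dir="${h2d.dir}">
      <!-- "location" resolves these relative paths against this
           wrapper's basedir and passes them on as absolute paths. -->
      <property name="args.input" location="xhtml-src"/>
      <property name="args.output" location="dita-out"/>
      <property name="args.infotype" value="topic"/>
    </ant>
  </target>

</project>
```

With this file saved as build.xml in a working folder, running ant there should be equivalent to typing the three -D arguments by hand.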

Once the build is successful, you can view the DITA file generated for each XHTML input. The tags in the XHTML are mapped to DITA elements based on the mapping rules present in the h2d.xsl file.

XSLT Mapping and Rules

The existing mapping rules in the plugin’s XSL file (such as h2d.xsl) can be edited to match the DITA standards being followed.

[Screenshot: Actual Rule]

[Screenshot: Modified Rule]

Apart from modifying the rules, a new mapping rule can also be introduced based on the DITA requirements as follows.

[Screenshot: New Rule]

In the actual rule, matching is done on the ‘div’ element. This can be changed to match any other element that is required (e.g. changed to ‘unknown’ in the modified rule). A new rule can also be created, as shown in the snapshot under New Rule, when matching based on custom conditions is required (e.g. matching an iframe based on conditions on its attributes).
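As a rough illustration of the kinds of rules described above, here is a minimal XSLT sketch. It is not copied from h2d.xsl: the output elements (p, xref), the attribute condition on iframe and the assumption that input elements are matched without a namespace prefix are all chosen for illustration only.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Existing-style rule (illustrative, not the verbatim h2d.xsl text):
       each div in the input becomes a DITA paragraph. Changing the
       match pattern is what the "modified rule" amounts to. -->
  <xsl:template match="div">
    <p>
      <xsl:apply-templates/>
    </p>
  </xsl:template>

  <!-- CUSTOM RULE START (hypothetical mapping)
       An iframe with a non-empty src attribute becomes an external
       cross-reference. iframes without src do not match this rule. -->
  <xsl:template match="iframe[@src != '']">
    <xref href="{@src}" scope="external" format="html">
      <xsl:choose>
        <!-- Prefer the iframe title as link text when present. -->
        <xsl:when test="@title">
          <xsl:value-of select="@title"/>
        </xsl:when>
        <xsl:otherwise>
          <xsl:value-of select="@src"/>
        </xsl:otherwise>
      </xsl:choose>
    </xref>
  </xsl:template>
  <!-- CUSTOM RULE END -->

</xsl:stylesheet>
```

In practice the custom template would be added to the plugin's XSL file rather than kept in a stand-alone stylesheet. A match pattern with a predicate, such as iframe[@src != ''], has a higher default priority in XSLT (0.5) than a bare element name (0), so it takes precedence over a generic iframe rule without any extra configuration.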

Best Practices

- Never modify core code in the files, other than adding custom mapping rules.

- Keep custom mapping rules in the same pattern as the existing ones, both to avoid issues and to make them work for the required use case.

- Use comments to mark custom mappings, so that they are easier to debug.

- Reuse only the templates and attribute sets required for your custom plug-in rather than copying entire DITA-OT files.

- Use attributes carefully to target specific elements and to avoid conflicts with pre-existing rules.

- For more detailed conventions, see https://www.dita-ot.org/dev/topics/plugin-coding-conventions.html

How to debug?

Enabling Debug Mode

1. Using Commands

Append the --debug (or -d) option to the dita command.

2. Using Property

Add the following property to an Ant target in your build file

<property name="args.debug" value="yes"/>

3. Using Command Line

dita --help
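For the property-based option above, the following stand-alone build file is a minimal sketch of where the property sits. The project and target names are hypothetical; in a real build, the property would go into the target that runs the conversion.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical stand-alone example; names are placeholders. -->
<project name="debug-example" default="show-debug">

  <target name="show-debug">
    <!-- Turns on debug logging for whatever runs in this target. -->
    <property name="args.debug" value="yes"/>
    <!-- Prints the effective value so the setting can be confirmed. -->
    <echo message="args.debug is ${args.debug}"/>
  </target>

</project>
```

Ant properties are immutable once set, so a value passed on the command line (for example -Dargs.debug=no) takes precedence over the one declared in the target.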

To understand more about DITA-OT-specific error messages, refer to the official documentation (https://www.dita-ot.org/dev/).

Summary

The DITA Open Toolkit provides an effective way to convert DITA files to whatever format a project demands and, with external plugins, to convert other formats into DITA as a pre-processing step. It is recommended to refer to the official documentation (https://www.dita-ot.org/dev/) during configuration and to check the availability of the required plugins.

Its templates and customisation capabilities are its major advantages. It ships with a set of readily available plugins, and many others can be easily installed and used for specific purposes. The external plugins discussed above are an added advantage: after proper installation, they can be used in the same way as a default plugin.

It also has dedicated community support (https://www.dita-ot.org/support) and is open to contributions, which enables support for additional plugins (https://github.com/dita-ot/ext-plugins) such as the H2D plugin used in the AEM Guides migration project.

References

http://dita-ot.sourceforge.net/1.5.3/

https://www.dita-ot.org/dev/
