Content Engineering

Insights for publishers and developers embarking on digital content initiatives.

Supercharge your content transforms with ContentBase

Introduction to the XSL Transform API

Wrycan
Oct 22, 2018

Many digital content initiatives leverage XML to model and store digital content. Once you have XML-based content, you can produce just about any other output you need.

Example: a straightforward XML representation of a blog post:

<?xml version="1.0" encoding="UTF-8"?>
<post>
<title>XML is great</title>
<body>
<para>
XML is <strong>great</strong>, don't you think?
</para>
</body>
</post>

If you are familiar with HTML, you may notice that while the tag names might be different, the syntax is the same (the tags are the bits between the <> symbols, <post> for example). In fact, XHTML and XHTML5 documents are XML documents that use tags defined by the HTML specification, including <p>, <b>, <html>, etc.

The primary way of converting XML to other formats is by using XSL transforms (XSLT). XSL is a declarative, functional-style language designed to process XML by traversing the XML tags and matching them against templates. Here is a simple XSL that transforms an XML document:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:fo="http://www.w3.org/1999/XSL/Format"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html" encoding="UTF-8"/>
<!-- start at the root (/) -->
<xsl:template match="/">
<xsl:apply-templates/>
</xsl:template>
<!-- every tag (element) will be handled by this template -->
<xsl:template match="*">
<xsl:value-of select="local-name()"/><xsl:text>,</xsl:text>
<!-- continue into the child elements -->
<xsl:apply-templates select="*"/>
</xsl:template>
</xsl:stylesheet>

This XSL will process every tag (element) in the XML and output the name of the tag (local-name()). If we run this XSL on our sample XML, the result would be:

post,title,body,para,strong,

This article isn’t meant to be an XSL tutorial; we won’t dive into the XSL syntax other than this basic example.

If we want to turn our <post> XML into a simple HTML page, the XSL might look like this:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:fo="http://www.w3.org/1999/XSL/Format"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html" encoding="UTF-8"/>
<xsl:template match="/">
<xsl:apply-templates/>
</xsl:template>
<xsl:template match="post">
<html>
<head>
<title><xsl:value-of select="title"/></title>
</head>
<body>
<xsl:apply-templates/>
</body>
</html>
</xsl:template>
<!-- the title is already output in the head, so suppress it in the body -->
<xsl:template match="title"/>
<xsl:template match="body">
<xsl:apply-templates/>
</xsl:template>
<xsl:template match="para">
<p><xsl:apply-templates/></p>
</xsl:template>
<xsl:template match="para/strong">
<span class="em"><xsl:apply-templates/></span>
</xsl:template>
</xsl:stylesheet>

The HTML result would be:

<html>
<head>
<title>XML is great</title>
</head>
<body>
<p>XML is <span class="em">great</span>, don't you think?</p>
</body>
</html>
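
If you want to try a transform like this outside ContentBase, one option is the XSLT processor that ships with the JDK. The sketch below is only an illustration under that assumption: it uses the standard javax.xml.transform API, not ContentBase, and the class name PostToHtml and the condensed stylesheet string are made up for this example. It says nothing about how ContentBase itself runs transforms.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class; not part of ContentBase.
final class PostToHtml {

    // The sample <post> document, written on one line.
    static final String POST_XML =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
        + "<post><title>XML is great</title><body>"
        + "<para>XML is <strong>great</strong>, don't you think?</para>"
        + "</body></post>";

    // A condensed post-to-HTML stylesheet. Only <body> is processed inside
    // the HTML body, so the title text is not repeated there.
    static final String POST_XSL =
        "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"html\" indent=\"no\"/>"
        + "<xsl:template match=\"post\">"
        + "<html><head><title><xsl:value-of select=\"title\"/></title></head>"
        + "<body><xsl:apply-templates select=\"body\"/></body></html>"
        + "</xsl:template>"
        + "<xsl:template match=\"para\"><p><xsl:apply-templates/></p></xsl:template>"
        + "<xsl:template match=\"para/strong\">"
        + "<span class=\"em\"><xsl:apply-templates/></span>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Applies the stylesheet text to the XML text and returns the result.
    static String transform(String xml, String xsl) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSL transform failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(POST_XML, POST_XSL));
    }
}
```

The exact whitespace, and any extra tags the processor adds to the head, can differ between XSLT processors, so compare the structure rather than the exact bytes.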

Simple. But what if you wanted to include some metadata about the <post> that is stored in ContentBase? How would you do that using just XSL?

Let’s say you store descriptive tags about post content not in the XML, but as properties in ContentBase. Storing this type of data in the repository means you aren’t altering the original content. So in ContentBase, you have a property on the <post> content type called “DescTags.” There can be 0, 1 or a bunch of values stored in this property for each <post>. The values are just simple strings like “Technical,” “Marketing,” “Press,” “News,” etc.

To pull those property values into the transformed output, you can use the XSL Transform API.

When you run any XSL in ContentBase, there are three parameters passed to the XSL by default. These should be defined near the top of the XSL like this:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:fo="http://www.w3.org/1999/XSL/Format"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html" encoding="UTF-8"/>
<!-- three params passed from ContentBase to XSL -->
<xsl:param name="contentId"/>
<xsl:param name="user"/>
<xsl:param name="xmsPublicationContext"/>
...
</xsl:stylesheet>

contentId — the ContentBase ID assigned to the XML document processed by the XSL.

user — the system username of the person running the transform.

xmsPublicationContext — a handle to the XSL Transform API, used to call all the API functions.
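
For readers curious how top-level parameters reach a stylesheet in general, here is a minimal sketch using the JDK's standard javax.xml.transform API. It is an assumption-laden illustration, not ContentBase code: the class name StylesheetParams and the values "post-123" and "jdoe" are invented, and xmsPublicationContext is left out because it is a ContentBase object that cannot be reproduced here.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class; not part of ContentBase.
final class StylesheetParams {

    // Declares two top-level params and prints them as plain text.
    static final String PARAM_XSL =
        "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"text\"/>"
        + "<xsl:param name=\"contentId\"/>"
        + "<xsl:param name=\"user\"/>"
        + "<xsl:template match=\"/\">"
        + "<xsl:value-of select=\"concat($contentId, ' run by ', $user)\"/>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Runs the stylesheet, supplying the params the way a host application would.
    static String run(String xml, String contentId, String user) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(PARAM_XSL)));
            // The names must match the xsl:param names declared in the stylesheet.
            transformer.setParameter("contentId", contentId);
            transformer.setParameter("user", user);
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSL transform failed", e);
        }
    }

    public static void main(String[] args) {
        // Prints: post-123 run by jdoe
        System.out.println(run("<post/>", "post-123", "jdoe"));
    }
}
```

Inside the stylesheet, each parameter is then read with the usual $name syntax, which is exactly how $contentId and $xmsPublicationContext are used in the rest of this article.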

So let’s add the DescTags, comma-separated, to the bottom of the HTML version of the post. The code snippet below shows how the XSL for the <body> tag would change:

...
<xsl:template match="body">
<xsl:apply-templates/>
<div class="tags">
<xsl:for-each
select="java:getProperty($xmsPublicationContext, $contentId, 'DescTags')">
<xsl:value-of select="."/>
<xsl:if test="position() != last()">, </xsl:if>
</xsl:for-each>
</div>
</xsl:template>
...

The important bit is:

java:getProperty($xmsPublicationContext, $contentId,'DescTags')

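The getProperty function call takes the xmsPublicationContext handle as the first argument, the ID of the content you want to query properties for as the second argument, and the name of the property to return as the third argument. The java prefix used in the call must also be bound to a namespace on the <xsl:stylesheet> element; that declaration is not shown in the snippets here.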

In the example above, the function call will query the repository for the ‘DescTags’ property and return the values as a NodeList. Again, this isn’t an XSL post, so we won’t go into details about the XSL syntax. The key here is how you call a function from the XSL and deal with the return values. The getProperty function is a useful example, but there are more than 50 functions available in the ContentBase XSL Transform API. We will be talking more about different API functions in future posts.
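
The part of the snippet that deals with the return values, the for-each with the position() != last() test, can be tried without ContentBase. The sketch below is a stand-in under stated assumptions: it replaces the getProperty call with a plain list of <value> elements in the input document, runs on the JDK's built-in XSLT processor, and uses a made-up class name (TagList) and element names. It shows only the comma-separation logic for 0, 1 or many values.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class; the <values>/<value> input stands in for the
// nodes that getProperty would return inside ContentBase.
final class TagList {

    // Same loop as in the article, selecting from the input document instead.
    static final String JOIN_XSL =
        "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"text\"/>"
        + "<xsl:template match=\"/\">"
        + "<xsl:for-each select=\"values/value\">"
        + "<xsl:value-of select=\".\"/>"
        // Emit a separator after every value except the last one.
        + "<xsl:if test=\"position() != last()\">, </xsl:if>"
        + "</xsl:for-each>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Builds the stand-in input and returns the comma-separated result.
    static String join(String... values) {
        StringBuilder xml = new StringBuilder("<values>");
        for (String value : values) {
            // Minimal escaping so the generated XML stays well-formed.
            String escaped = value.replace("&", "&amp;").replace("<", "&lt;");
            xml.append("<value>").append(escaped).append("</value>");
        }
        xml.append("</values>");
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(JOIN_XSL)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xml.toString())),
                new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSL transform failed", e);
        }
    }

    public static void main(String[] args) {
        // Prints: Technical, Marketing, Press
        System.out.println(join("Technical", "Marketing", "Press"));
    }
}
```

With a single value no separator is written, and with no values the loop body never runs, so the output is empty.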

Want to chat more about ContentBase and the XSL Transform API?

Use the contact form on our site to get in touch. We will not share or publish any of your contact information. Period. Read our privacy policy for more detail.

About Wrycan, Inc.

Founded in 2003, Wrycan is a Content Engineering company that helps content production and product development work together successfully. Wrycan, Inc. delivers Content Engineering services and technology to publishers and content producers to bridge the gap between content production and digital delivery. Companies building web and mobile platforms leverage our turn-key ContentBase platform to deliver content to their customers successfully.

Wrycan is headquartered in Kendall Square, Cambridge, Massachusetts.

For direct inquiries: inquire@wrycan.com or visit http://www.wrycan.com/home/contact
