The Page Builder Chronicles: Part 1 — Turning HTML to Markup

Jason Wandrag
8 min read · May 31, 2023


The Magic of HTML to Markup: Unleashing the Power of JavaScript by Transforming HTML into Objects!

As part of my training as a Web Wizard, I decided to give myself a challenge: I want to create a rudimentary page builder application with 0 dependencies. This means that I would need to create all the functionality to create and delete an element, as well as update the content and styles of the element. I also need to be able to save the work I have done, as well as upload previous work and edit it.

Working backwards, this leads me into Part 1 of this series, which tackles how to save my HTML as a JSON object.

Planning the structure of the JS object

When thinking about how I want to save the JS object, I need to think about how HTML is structured in general.

My first thought is that HTML is made up of tags, and the content of a tag can be either another tag or some text. This means that the first 3 properties I need to think about are:

  1. The type of content, which is either an element or text.
  2. If the type of the content is “text”, then the only other property I care about is the content of the text (which I will refer to as tagText, otherwise known as the text inside the tag), otherwise:
  3. If the type of the content is “element”, then I need to create a property to contain the children of that tag, which would be either another elementMarkupObject or a textMarkupObject.

const textMarkupObject = {
  contentType: "text",
  tagText: "Lorem Ipsum",
};

This is the markup for an HTML Text Node

const elementMarkupObject = {
  contentType: "element",
  children: [],
};

This is the markup for an HTML Element

The next thing that I need to account for is that HTML has many tags, each with its own tag name. Each tag can also have attributes that give more information about the tag. The updated structure for the Element Markup would be as follows:

const elementMarkupObject = {
  contentType: "element",
  tagName: "div", // This would be any valid HTML tag name
  children: [], // An optional array of markup objects
  attributes: [], // An optional array of HTML attributes
};

Lastly, I would like to give each element a unique ID, so later I can easily identify the HTML element that the markup references. Since the text of an element lives in a text node that is a child of that element, I do not need to give the text markup itself an ID. The final structure of my element is as follows:

const elementMarkupObject = {
  componentID: 1, // This will be unique for each ELEMENT object
  contentType: "element",
  tagName: "div", // This would be any valid HTML tag name
  children: [], // An optional array of markup objects
  attributes: [], // An optional array of HTML attributes
};
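
To make the shape concrete, here is a hand-written sketch of how a small snippet such as <p class="intro">Hello</p> could be described with these objects. The snippet, the class name, and the ID value are made up for illustration; the attribute entry uses the attributeName/attributeValue pair that the createAttribute factory later in this article returns.

```javascript
// Hypothetical markup for: <p class="intro">Hello</p>
const paragraphMarkup = {
  componentID: 0, // illustrative value; any unique number works
  contentType: "element",
  tagName: "p",
  attributes: [{ attributeName: "class", attributeValue: "intro" }],
  children: [
    // Text markup carries no ID, only its content
    { contentType: "text", tagText: "Hello" },
  ],
};

console.log(paragraphMarkup.children[0].tagText); // prints "Hello"
```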

Creating Unique IDs for the markup

When creating IDs for each component, I want to have a simple way to generate a new number and keep track of that number, without having to include a library to generate a number. I also don't want to use a random number generator, as I want to ensure that no 2 IDs are the same.

To do this, I am making use of one of JavaScript's newer features (added in ES2015): the generator function.

function* componentIdGenerator() {
  let index = 0;
  while (true) {
    yield index++;
  }
}

Here we have a magical function called componentIdGenerator() that can create unique identification numbers for components in a system. Every time we ask it for the next value, it produces a new number, ensuring that no two components have the same ID.

Inside this magical function, we have a special keyword called yield. It's like a pause button that lets the function take a break and remember where it left off. Here's what the function does step by step:

  1. We start by setting up a variable called index and give it the initial value of 0. This index will keep track of the current number we want to generate.
  2. Now comes the interesting part. We have a loop that says while (true). This loop runs indefinitely, forever! It means that our magical function will keep generating new numbers as long as we ask for them.
  3. Inside the loop, we find the yield index++ statement. This line does two things at once. The index++ part evaluates to the current value of index and increases index by 1, and yield hands that current value to us and pauses the function. The next time we ask for a number, it will be one higher than before.

To put it simply, when we call the componentIdGenerator() function, it creates a special "generator object" that remembers the state of the function. Every time we want a new component ID, we call a special method called next() on the generator object. This method resumes the function from where it left off and gives us the next number in the sequence.

By using this function, we can ensure that every component in our system gets a unique identification number, starting from 0 and increasing by 1 each time. It’s like having a magical factory that produces an endless supply of IDs for our components, making sure they are all distinct and never repeated.

Next, I create a variable to help keep track of the current value of the generator:

const generator = componentIdGenerator();

With this, I can set the componentID property of each elementMarkupObject to generator.next().value, which returns the next ID from the generator.
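
A short sketch of how those calls behave. It repeats the generator from above so that it runs on its own; the variable names other than componentIdGenerator and generator are only for illustration.

```javascript
function* componentIdGenerator() {
  let index = 0;
  while (true) {
    yield index++;
  }
}

const generator = componentIdGenerator();

// Each call to next() resumes the function until it reaches yield again
const first = generator.next(); // { value: 0, done: false }
const second = generator.next().value; // 1
const third = generator.next().value; // 2

// Calling componentIdGenerator() again creates an independent counter,
// so IDs are only unique while a single generator object is shared
const otherGenerator = componentIdGenerator();
const restarted = otherGenerator.next().value; // 0

console.log(first, second, third, restarted);
```

This is why it helps to create the generator once and reuse that one object everywhere an ID is needed.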

Markup Factory Functions

Now that I have an ID generator, I want to focus on creating factory functions for each aspect of the markup. The factory functions I want to create are the following:

  1. createText which will create a text node
  2. createAttribute which will give an element extra information like styles, classes and IDs.
  3. createElement which will create an element with its necessary options and information.

First, let's start with the simplest of these factory functions: createText, which takes the text as a parameter and returns an object that looks like our textMarkupObject.

const createText = (tagText) => {
  return { contentType: "text", tagText };
};

Next, let's create our createAttribute factory function, which should take in an attribute name and an attribute value. I also want to accommodate attributes like contenteditable, which do not need a value, as the mere presence of such an attribute is equivalent to contenteditable=true. This means that if no value is submitted, it should default to a value of true:

const createAttribute = (attributeName, attributeValue = true) => {
  return { attributeName, attributeValue };
};
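
A quick sketch of the default in action. One detail to be aware of: a default parameter only applies when the argument is undefined, and the DOM reports a valueless attribute such as contenteditable with an empty string as its value, so passing that empty string along keeps it as-is. The attribute names and values below are just examples.

```javascript
const createAttribute = (attributeName, attributeValue = true) => {
  return { attributeName, attributeValue };
};

// An explicit value is stored unchanged
const withValue = createAttribute("class", "card");

// No value given, so the default of true is used
const withoutValue = createAttribute("contenteditable");

// An empty string is not undefined, so the default is NOT applied
const emptyString = createAttribute("contenteditable", "");

console.log(withValue, withoutValue, emptyString);
```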

Lastly, I want to create my createElement factory function, which will take in a tag name, as well as an optional parameter for options. The options parameter refers to the attributes and children the element may have. This function will also make use of my ID generator to give each element a unique ID:

const createElement = (tagName, options) => {
  return {
    contentType: "element",
    tagName,
    ...options,
    componentID: generator.next().value,
  };
};
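
Putting the three factories together, a small tree can be assembled by hand. The sketch below repeats the definitions from this section in a compact form so it runs on its own; the tag names, class name, and text are made up.

```javascript
function* componentIdGenerator() {
  let index = 0;
  while (true) yield index++;
}
const generator = componentIdGenerator();

const createText = (tagText) => ({ contentType: "text", tagText });
const createAttribute = (attributeName, attributeValue = true) => ({
  attributeName,
  attributeValue,
});
const createElement = (tagName, options) => ({
  contentType: "element",
  tagName,
  ...options,
  componentID: generator.next().value,
});

// The heading is created first, so it receives the lower ID
const heading = createElement("h1", { children: [createText("Hello")] });
const wrapper = createElement("div", {
  attributes: [createAttribute("class", "card")],
  children: [heading],
});

// options can be left out entirely; spreading undefined adds nothing
const lineBreak = createElement("br");

console.log(JSON.stringify(wrapper, null, 2));
```

Because componentID is written after ...options, an ID passed in through options would be overwritten by the generated one.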

Checking if an element has any text as a direct child

Before creating the function that will compile my DOM object into a simpler JS markup object, I need a small helper function, which will determine if there is text as a direct child of an element. This is needed because the textContent value of an element is the combined text of all of its descendants, not just of its direct children.

If this is not accounted for, every ancestor of a piece of text would get its own textMarkupObject, all the way up our markup tree.

function hasTextChild(element) {
  for (let i = 0; i < element.childNodes.length; i++) {
    const child = element.childNodes[i];
    if (
      child.nodeType === Node.TEXT_NODE &&
      child.textContent &&
      child.textContent.trim().length > 0
    ) {
      return true;
    }
  }
  return false;
}
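
To see the difference this makes, the sketch below runs the same check against plain objects that mimic the small part of the DOM the helper reads (childNodes, nodeType, textContent). The mock objects and the TEXT_NODE and ELEMENT_NODE constants are stand-ins so the example runs outside a browser; in the browser, Node.TEXT_NODE has the value 3 and Node.ELEMENT_NODE the value 1.

```javascript
const TEXT_NODE = 3; // same value as Node.TEXT_NODE in the browser
const ELEMENT_NODE = 1; // same value as Node.ELEMENT_NODE in the browser

function hasTextChild(element) {
  for (const child of element.childNodes) {
    if (
      child.nodeType === TEXT_NODE &&
      child.textContent &&
      child.textContent.trim().length > 0
    ) {
      return true;
    }
  }
  return false;
}

// Mimics <p>Hello</p>
const paragraph = {
  nodeType: ELEMENT_NODE,
  childNodes: [{ nodeType: TEXT_NODE, textContent: "Hello" }],
};

// Mimics a <div> that wraps the paragraph on its own lines:
// its only direct text nodes are the whitespace around the <p>
const wrapper = {
  nodeType: ELEMENT_NODE,
  childNodes: [
    { nodeType: TEXT_NODE, textContent: "\n  " },
    paragraph,
    { nodeType: TEXT_NODE, textContent: "\n" },
  ],
};

console.log(hasTextChild(paragraph)); // true
console.log(hasTextChild(wrapper)); // false
```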

Combining these functions to compile HTML to a JavaScript object

Now that I have all the individual features I need to turn HTML into a JavaScript object, I need to create a function called createMarkdownForElement that brings this all together.

This function will take in 2 parameters, tagName and element. tagName refers to the type of tag you want to create, and element refers to the actual DOM element that we are looking to turn into markup.

export default function createMarkdownForElement(tagName, element) {
  // Code goes here
}

Now, to recap:

  1. An HTML element can have children, which are either more elements or text. I plan to use recursive programming to handle this.
  2. An element also has attributes to give it more information.

Let's start with creating the children for the element:

const children = element.children.length
  ? Array.from(element.children).map((child) =>
      createMarkdownForElement(child.tagName.toLowerCase(), child)
    )
  : [];

Here, a few things are happening. Firstly, I am using a ternary operator to check if there are any children inside the element I have selected.

If there are children, I map over these children and recursively call the createMarkdownForElement function, passing in the child element's tagName (lowercased, since the DOM reports HTML tag names in upper case) as the first argument, and the child itself as the second argument. This will recursively build the HTML structure of our markup.

If there are no children, I return an empty array instead.

const attributes = element.attributes?.length
  ? Array.from(element.attributes)
      .map((attribute) => {
        return createAttribute(attribute.name, attribute.value);
      })
      .filter((attribute) => attribute)
  : [];

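For the attributes of the element, I take a similar approach to the creation of the children. Here, too, I wrap element.attributes in Array.from(), as the attributes of a DOM element are stored in a NamedNodeMap, which is not an array but is iterable. The filter() call at the end drops any falsy entries; this matters in the final version of the function, where the map callback returns nothing for attributes that should be skipped.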

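Next, I want to see if there is any text as a direct child of the element and add it to its children using the createText factory function. Keep in mind that element.textContent also includes the text of nested elements, so this step is cleanest when an element holds either text or child elements rather than a mix of both: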

if (hasTextChild(element)) children.push(createText(element.textContent));

Lastly, I want to pass this data through my createElement factory function and return the data that is created:

return createElement(tagName, { children, attributes });

This nicely wraps up the function to give us back a JS object that represents the structure and details of the DOM element we pass in. Here is what the whole function looks like:

function createMarkdownForElement(tagName, element) {
  // Recursively convert every child element
  const children = element.children.length
    ? Array.from(element.children).map((child) =>
        createMarkdownForElement(child.tagName.toLowerCase(), child)
      )
    : [];

  // Convert the attributes, skipping contenteditable
  const attributes = element.attributes?.length
    ? Array.from(element.attributes)
        .map((attribute) => {
          if (attribute.name === "contenteditable") return;
          return createAttribute(attribute.name, attribute.value);
        })
        .filter((attribute) => attribute)
    : [];

  // Add the text only when the element has text as a direct child
  if (hasTextChild(element) && element.textContent)
    children.push(createText(element.textContent));

  return createElement(tagName, { children, attributes });
}
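
One thing to watch for: when an element mixes text and child elements, such as <p>Hello <b>world</b></p>, element.textContent returns "Hello world", so the nested text ends up in the markup twice and the text is appended after the element children. A possible variant, sketched below and not taken from the original code, walks childNodes instead so that each text node is recorded on its own and in document order. The function name createMarkupFromNode, the nextId counter, and the plain-object stand-ins for DOM nodes are all hypothetical and only there so the example runs outside a browser.

```javascript
const ELEMENT_NODE = 1; // same value as Node.ELEMENT_NODE in the browser
const TEXT_NODE = 3; // same value as Node.TEXT_NODE in the browser

let nextId = 0; // simple counter standing in for the ID generator

// Hypothetical variant: walks childNodes so text keeps its position
function createMarkupFromNode(node) {
  if (node.nodeType === TEXT_NODE) {
    // Skip whitespace-only text, mirroring hasTextChild
    return node.textContent.trim().length > 0
      ? { contentType: "text", tagText: node.textContent }
      : null;
  }
  if (node.nodeType !== ELEMENT_NODE) return null; // e.g. comments

  const children = Array.from(node.childNodes)
    .map((child) => createMarkupFromNode(child))
    .filter(Boolean);
  const attributes = Array.from(node.attributes ?? []).map((attribute) => ({
    attributeName: attribute.name,
    attributeValue: attribute.value,
  }));

  return {
    contentType: "element",
    tagName: node.tagName.toLowerCase(),
    children,
    attributes,
    componentID: nextId++,
  };
}

// Plain-object stand-in for <p>Hello <b>world</b></p>
const text = (textContent) => ({ nodeType: TEXT_NODE, textContent });
const bold = {
  nodeType: ELEMENT_NODE,
  tagName: "B",
  attributes: [],
  childNodes: [text("world")],
};
const paragraph = {
  nodeType: ELEMENT_NODE,
  tagName: "P",
  attributes: [],
  childNodes: [text("Hello "), bold],
};

const markup = createMarkupFromNode(paragraph);

// Lists the children in document order: the text "Hello " and then the b element
console.log(markup.children.map((child) => child.tagText ?? child.tagName));
```

In a browser, the same function could be given a real element, since DOM nodes expose the same nodeType, childNodes, attributes, and tagName properties.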

How to use the markup generator

In order to use this function, we need to select a DOM element we want to turn into JavaScript, and pass the element’s tagName and the element itself into the function:

const elementToConvert = document.querySelector('body');
const elementMarkdown = createMarkdownForElement(
  elementToConvert.tagName.toLowerCase(),
  elementToConvert
);

OPTIONAL: If you want to save this object as a JSON file, just add the following code right after the call above. This will create a link to download the JSON file and remove the link right after:

const dataStr = "data:text/json;charset=utf-8," + encodeURIComponent(JSON.stringify(elementMarkdown));
const downloadAnchorNode = document.createElement("a");
downloadAnchorNode.setAttribute("href", dataStr);
downloadAnchorNode.setAttribute("download", "markdown.json");
document.body.appendChild(downloadAnchorNode); // Required for Firefox
downloadAnchorNode.click();
downloadAnchorNode.remove();

You can view an example of this here.
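
Since the markup only contains plain objects, arrays, strings, numbers, and booleans, it survives a round trip through JSON unchanged. A small sketch with a hand-written markup object (the object itself is made up for illustration):

```javascript
// Hand-written stand-in for the result of createMarkdownForElement
const elementMarkdown = {
  contentType: "element",
  tagName: "p",
  children: [{ contentType: "text", tagText: "Hello" }],
  attributes: [{ attributeName: "class", attributeValue: "intro" }],
  componentID: 0,
};

// Serialize: this is the string that ends up in the downloaded file
const json = JSON.stringify(elementMarkdown);

// Parse it back into a plain JavaScript object
const restored = JSON.parse(json);

console.log(restored.children[0].tagText); // prints "Hello"
```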

Conclusion

With this, I have an easy way to save any HTML element as JSON. You can view and use this code from my GitHub repository, or install it from NPM.

I would appreciate any helpful tips and tricks to further improve myself, and if you have found this blog helpful, please comment and let me know!

Please follow me so that you don’t miss Part 2, which will focus on the opposite side of this functionality: taking our JSON, compiling it into a JavaScript object, then rendering our HTML.

Contact me here: LinkedIn | Codepen | Email


Jason Wandrag

With a passion for front end development, I am aiming to make the web a beautiful place by innovating how we consume information and interact with the web.