Rethinking web content editing with Cobalt

A look into Cobalt and how it handles annotations

Published in SimplyEdit, Jan 15, 2019 (8 min read)

Cobalt is part of our research into improving the usability of rich text editing for the web. Cobalt builds on the concepts of Hope, which is part of SimplyEdit. The aim is a more intuitive editing experience with a simpler and more flexible programming API.

Cobalt is a new way of marking up text that is based on very old ideas. Instead of adding markup tags inline in a text document, the markup is added as an annotation in an external document. The annotation references a specific range of the original text, without changing the original.

This has the obvious disadvantage that we now need to keep two documents in sync. To do this, you need a specific editor for this new format; you cannot use a generic text editor. In the 1970s, when most of the technology we still use today was designed, this was a major drawback. Today’s editors are so complex that this should no longer be such a problem (I hope).

The basic idea

So the basic idea of Cobalt is to not do this:

<h1>This is a title</h1>
<p>And a <strong>bold paragraph</strong></p>

But instead do this.
Text:

This is a title
And a bold paragraph

The annotations:

0–15:h1
16–36:p
22–36:strong

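Annotations live in their own document and reference the source text by character offset. In these examples, each range starts at its first offset and stops just before its second, and the line break counts as one character. Annotations may overlap; there is no inherent tree structure here.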
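To make the offsets concrete, here is a minimal sketch of how the two documents could be held in memory. The `parseAnnotations` helper and the `{ start, end, tag }` object shape are assumptions made for illustration, not Cobalt’s actual format or API.

```javascript
// Hypothetical in-memory model: plain text plus a flat list of annotations.
const text = "This is a title\nAnd a bold paragraph";

// Parse lines such as "16-36:p". A hyphen and an en dash are both accepted.
function parseAnnotations(source) {
  return source
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => {
      const match = /^(\d+)[-–](\d+):(.+)$/.exec(line);
      if (!match) throw new Error("Cannot parse annotation: " + line);
      return { start: Number(match[1]), end: Number(match[2]), tag: match[3] };
    });
}

const annotations = parseAnnotations("0–15:h1\n16–36:p\n22–36:strong");

// Offsets are treated as [start, end), so slice() returns the annotated text.
for (const a of annotations) {
  console.log(a.tag + ": " + JSON.stringify(text.slice(a.start, a.end)));
}
```

Because the end offset is exclusive, each annotation maps directly onto a `String.prototype.slice` call on the untouched source text.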

The first part is now much simpler, since it is just plain text, but the second part, with the markup, is clearly impractical to type by hand. Any time the source text changes, the annotation offsets must change as well.

But if there is an editor that reads both documents and automatically updates the offsets in the annotations when you type in the source document, then there are some interesting benefits to this approach.
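The bookkeeping such an editor has to do when you type could look roughly like the sketch below. The `insertText` function is hypothetical, and the rule for text inserted exactly on a boundary (it falls outside an annotation that starts there and inside one that ends there) is just one possible policy, chosen here for illustration.

```javascript
// Hypothetical helper: insert text and shift every annotation boundary
// that lies at or after the insertion point. Returns a new document.
function insertText(doc, position, inserted) {
  const shift = inserted.length;
  const move = (offset) => (offset >= position ? offset + shift : offset);
  return {
    text: doc.text.slice(0, position) + inserted + doc.text.slice(position),
    annotations: doc.annotations.map((a) => ({
      ...a,
      start: move(a.start),
      end: move(a.end),
    })),
  };
}

const doc = {
  text: "This is a title\nAnd a bold paragraph",
  annotations: [
    { start: 0, end: 15, tag: "h1" },
    { start: 16, end: 36, tag: "p" },
    { start: 22, end: 36, tag: "strong" },
  ],
};

// Type "new " in front of the word "title".
const edited = insertText(doc, 10, "new ");
for (const a of edited.annotations) {
  console.log(a.start + "-" + a.end + ":" + a.tag, JSON.stringify(edited.text.slice(a.start, a.end)));
}
```

The h1 annotation grows by four characters and the two later annotations simply move four characters to the right; none of them needs to know about the others.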

The benefits

First, the source document isn’t affected by the annotations. This not only means that you can annotate documents that have a specific format, without affecting their other use cases; you can also annotate documents that you don’t have write access to. You can even have more than one annotation document for the same source document. In all these cases, though, you must be certain that the source document doesn’t change underneath the annotations.

This is actually doable today, either by referencing versions in a version control system like Git or by referencing documents on a system like IPFS.
To keep annotations correct for newer versions, you need a robust diff algorithm or extra metadata. This is still in the realm of the possible, though more complex. However, because the source document is always referenced by character (or grapheme) offset, the problem is a lot less complex than you might think.

Another advantage of annotations is that the API you need to alter the document and annotations is much simpler than one for a markup language. Annotations do not force tree structures to appear in the text or in the annotations list itself. There are some use cases where the tree structure has benefits, but these are usually more in the realm of application building or page design. When editing content, these tree structures only make life more complicated.

The problem with trees

Take HTML for example. There are many so-called WYSIWYG (What You See Is What You Get) editors for HTML. Some even work in the browser. All of these use a flat range approach in the user interface. You can place a cursor somewhere and move it forward or backward through the content. In effect, the interface masks the underlying tree structure. You can select a range, which has a start and end point, as if the content were one-dimensional. This is how humans think about text. With annotations, this becomes the underlying truth as well.

For a concrete example of the advantage, say you want to change a piece of content by marking up a range to become bold. The HTML looks like this:

<ul>
 <li>This is line one.</li>
 <li>And this is line two.</li>
</ul>

To make the text bold from ‘line one’ to ‘line two’, the resulting HTML must end up like this:

<ul>
 <li>This is <strong>line one.</strong></li>
 <li><strong>And this is line two</strong>.</li>
</ul>

The <strong> tag is split into two because it cannot cross over the <li> tag. This is a relatively simple case, but with more complex documents, the code to mark this up correctly gets positively gnarly.

In contrast, here is the same for annotations.
Text:

This is line one.
And this is line two.

The annotations:

0–39:ul
0–17:li
18–39:li
8–38:strong

There is no need to split the annotations. There is never a need to change an annotation because of other annotations. Each annotation is blissfully unaware of other annotations. They only reference the source document.
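As a small sketch of what this means in code: marking the range bold is a plain append to the list. The `annotate` function and the object shape are assumptions for illustration. The list item offsets are computed from the text here, leaving the line break itself out of both items.

```javascript
const text = "This is line one.\nAnd this is line two.";

// One "li" per line, computed from the text (the newline itself is left out).
let offset = 0;
const items = text.split("\n").map((line) => {
  const item = { start: offset, end: offset + line.length, tag: "li" };
  offset += line.length + 1;
  return item;
});
const annotations = [{ start: 0, end: text.length, tag: "ul" }, ...items];

// Marking a range is a plain append. No existing annotation is touched.
function annotate(list, start, end, tag) {
  return [...list, { start, end, tag }];
}

const start = text.indexOf("line one");
const end = text.indexOf("line two") + "line two".length;
const marked = annotate(annotations, start, end, "strong");

console.log(JSON.stringify(text.slice(start, end)));
console.log(marked.length - annotations.length, "annotation added");
```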

The result is an API that can be much smaller and simpler than the DOM API for HTML. To change the text, all you need is the basic text editor operations: cut, copy, insert, delete, and search. To define positions, all you need is a one-dimensional range. Ranges behave similarly to sets, with added restrictions. So for ranges, you get basic set operations, union and intersect, plus range-specific delete and insert. Finally, the annotations list is just a list, so you can use normal array operations on it.
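The set-like behaviour of ranges can be sketched in a few lines. This is an illustration under assumptions, not Cobalt’s Range implementation: a range is modelled as an array of `[start, end)` pairs, and `normalize`, `union` and `intersect` are hypothetical names.

```javascript
// Sort the regions and merge any that overlap or touch.
function normalize(regions) {
  const sorted = regions
    .filter(([start, end]) => end > start)
    .sort((a, b) => a[0] - b[0]);
  const merged = [];
  for (const [start, end] of sorted) {
    const last = merged[merged.length - 1];
    if (last && start <= last[1]) {
      last[1] = Math.max(last[1], end);
    } else {
      merged.push([start, end]);
    }
  }
  return merged;
}

// Everything covered by either range.
function union(a, b) {
  return normalize([...a, ...b]);
}

// Only the parts covered by both ranges.
function intersect(a, b) {
  const result = [];
  for (const [aStart, aEnd] of normalize(a)) {
    for (const [bStart, bEnd] of normalize(b)) {
      const start = Math.max(aStart, bStart);
      const end = Math.min(aEnd, bEnd);
      if (start < end) result.push([start, end]);
    }
  }
  return normalize(result);
}

const a = [[10, 19], [30, 36]];
const b = [[15, 32]];
console.log(JSON.stringify(union(a, b)));     // [[10,36]]
console.log(JSON.stringify(intersect(a, b))); // [[15,19],[30,32]]
```

Both operations return a value of the same shape they accept, so results can be fed straight into the next operation.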

A glimpse into the future

Here is a final example of what an API for Cobalt could look like, showing an operation that is simple in Cobalt and almost impossible in normal markup languages:

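 // Fictional API: combine the text and its annotations into one fragment.
 var fragment = cobalt.fragment(text, annotations);
 // Collect the ranges of all paragraph annotations.
 var range = fragment.annotations.filter('p').getRange();
 // Within those ranges, find the first word of each and mark it bold.
 fragment.text.select(range).search(/^\w+/).annotate('strong');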

This code is fictional; there is no API that supports this yet, but there could be. The code searches for paragraphs and then searches for the first word in each paragraph and makes it bold. The paragraphs are entries in a one-dimensional list, so a simple filter will do. The words are just plain text, so a standard regular expression is enough, although in this case limited to the selected range. Finally, a search returns a range, which you can use to add a new annotation.

Note that ranges may contain multiple sets of simple ranges, so although you get a single range object from getRange(), it may reference multiple regions in the text. A range can be something like this: 10-19,30-36
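The same operation can be approximated today with plain arrays and strings. The sketch below is a stand-in under assumptions, not the Cobalt API: annotations are `{ start, end, tag }` objects, and `firstWords` is a hypothetical helper that returns one region per paragraph.

```javascript
const text = "This is a title\nAnd a bold paragraph\nAnother paragraph";
const annotations = [
  { start: 0, end: 15, tag: "h1" },
  { start: 16, end: 36, tag: "p" },
  { start: 37, end: 54, tag: "p" },
];

// Return a [start, end) region for the first word inside each "p" annotation.
function firstWords(text, annotations) {
  const regions = [];
  for (const p of annotations.filter((a) => a.tag === "p")) {
    const match = /^\w+/.exec(text.slice(p.start, p.end));
    if (match) regions.push([p.start, p.start + match[0].length]);
  }
  return regions;
}

// One multi-region range goes in, one "strong" annotation per region comes out.
const bolded = annotations.concat(
  firstWords(text, annotations).map(([start, end]) => ({ start, end, tag: "strong" }))
);

for (const a of bolded.filter((a) => a.tag === "strong")) {
  console.log(a.start + "-" + a.end, JSON.stringify(text.slice(a.start, a.end)));
}
```

The result of `firstWords` is exactly the kind of multi-region range described above: two separate regions that together form one selection.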

To get the same effect using the HTML DOM API would be a lot more difficult. The reason is that you must assume that the first word of a paragraph may be split over multiple HTML elements. You must do a tree walk over each paragraph to find the first word. Then you must do another tree walk to apply the markup, taking care to only add the markup where it is valid to do so. This is not a trivial problem. You can opt to use the browser’s built-in code, using execCommand('bold'), but this limits you to the markup that is supported by execCommand. And the complexity is still there anyway, just solved for you by anonymous browser developers.

So where is Cobalt today?

Cobalt is based on an earlier prototype, called Hope. This implemented most of the needed features and was meant to research the possibility of building something like this. In that regard it’s a success. In fact, the API it provides to manipulate HTML is so much simpler than the default DOM API that a colleague used the code and integrated it into a browser-based WYSIWYG-editor he wrote (SimplyEdit).

Cobalt focuses on implementing a more correct and complete version. Most notably it has a consistent and orthogonal Range Algebra. Any operation will always return a valid Range again. This means that ranges can become complex, consisting of multiple basic ranges, as mentioned before. Whenever you cut the middle out of a range, you end up with two separate ranges. So a single Range must be able to have separate regions.
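Cutting the middle out of a range can be sketched as a set difference. As before, this is an illustration under assumptions rather than Cobalt’s Range Algebra: a range is an array of `[start, end)` pairs and `subtract` is a hypothetical name.

```javascript
// Remove one [start, end) region from a range made of [start, end) regions.
function subtract(range, cut) {
  const [cutStart, cutEnd] = cut;
  const result = [];
  for (const [start, end] of range) {
    if (cutEnd <= start || cutStart >= end) {
      // No overlap: the region survives unchanged.
      result.push([start, end]);
      continue;
    }
    // Keep whatever sticks out on the left and on the right of the cut.
    if (start < cutStart) result.push([start, cutStart]);
    if (cutEnd < end) result.push([cutEnd, end]);
  }
  return result;
}

console.log(JSON.stringify(subtract([[10, 36]], [19, 30]))); // [[10,19],[30,36]]
```

The input was one region and the output is two, yet both are the same kind of value, which is what lets every operation return a valid range again.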

Another focus for Cobalt is the correct rendering of a Cobalt document to HTML. This is what the world knows and browsers can render. By supporting it correctly, there is a bridge to use Cobalt in real-world scenarios and I don’t have to make a rendering engine myself. But this is not a solved problem yet. As mentioned, it is pretty difficult to generate correct HTML for all cases. On top of that, Cobalt lets you layer overlapping annotations, allowing for many combinations that are illegal in HTML. It is quite a challenge to make a robust HTML rendering that results in the most complete but still correct HTML version possible.
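To give a feel for the problem, here is a deliberately naive renderer sketch. It is not Cobalt’s rendering algorithm. The `render` function is hypothetical, and it assumes that the order of the annotation list decides nesting (earlier entries end up outermost). Wherever the set of active annotations changes, it closes the tags that no longer fit and reopens the ones that continue.

```javascript
function escapeHtml(s) {
  return s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

function render(text, annotations) {
  // Every offset where some annotation starts or ends.
  const cuts = new Set([0, text.length]);
  for (const a of annotations) {
    cuts.add(a.start);
    cuts.add(a.end);
  }
  const positions = [...cuts].sort((x, y) => x - y);

  let html = "";
  let open = [];
  for (let i = 0; i < positions.length; i++) {
    const pos = positions[i];
    // Annotations covering the segment that starts here, in list order.
    const wanted = annotations.filter((a) => a.start <= pos && pos < a.end);
    // Keep the tags that are already open in the right order.
    let keep = 0;
    while (keep < open.length && keep < wanted.length && open[keep] === wanted[keep]) keep++;
    // Close the rest innermost first, then open what this segment needs.
    for (let j = open.length - 1; j >= keep; j--) html += "</" + open[j].tag + ">";
    for (let j = keep; j < wanted.length; j++) html += "<" + wanted[j].tag + ">";
    open = wanted;
    const next = i + 1 < positions.length ? positions[i + 1] : text.length;
    html += escapeHtml(text.slice(pos, next));
  }
  return html;
}

console.log(render("This is a title\nAnd a bold paragraph", [
  { start: 0, end: 15, tag: "h1" },
  { start: 16, end: 36, tag: "p" },
  { start: 22, end: 36, tag: "strong" },
]));

// Two overlapping inline annotations: the second one has to be split.
console.log(render("Hello overlapping world", [
  { start: 0, end: 17, tag: "strong" },
  { start: 6, end: 23, tag: "em" },
]));
```

For the first example in this article, this reproduces the hand-written HTML. The sketch knows nothing about block versus inline elements, though, so on the list example it would leave a stray strong element between the two li elements, which is not valid inside a ul. That is exactly the kind of case a robust renderer has to handle.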

Then there are a number of choices made in how WYSIWYG editing works today that seem to have their origin in how HTML itself is constructed. One example is how whitespace is treated. Since HTML is designed to be edited by hand in a plain text editor first, and it contains tree structures, it explicitly supports indentation. This means that it must ignore the whitespace used for indentation in the rendering. So all consecutive whitespace is rendered as a single space by default. If it is between block elements, it gets ignored entirely. Only if you specify a non-default setting in CSS will you see the whitespace as it is. Since Cobalt has no markup in the source document and has no tree structures at all, there is no need to handle whitespace in any other way than normal characters. However, browsers will still collapse whitespace if you don’t say otherwise.

Finally, the split into two types of elements, block and inline, has forced a set of operations on WYSIWYG-editors. When changing block markup, it is automatically applied to the current block, regardless of your selection. Using Cobalt we may find there are other options that may be more intuitive. The same goes for default actions tied to pressing `Enter` or `Return`. I think there is room to re-evaluate how a WYSIWYG-editor responds to specific commands if the underlying document model is radically different.

The current work focuses on building a consistent and user-friendly API and using that to build a simple and user-friendly editor. The editor implementation is used to verify that the API is complete and of high enough quality. The editor is meant to be run in a browser, so a high-quality HTML renderer of Cobalt is part of this effort.

Once this is finished, the focus will shift to improving the performance of Cobalt. In its current state, it is impractical to work on large documents. There is no reason to assume that Cobalt must perform worse than current WYSIWYG-editors, though; it might in fact do better.

The focus on replacing current WYSIWYG HTML-editors also means that the annotation language uses the same tags and attributes as HTML. There is no reason this must remain so in the future and every reason to assume there may be a better way to annotate documents than that.

If I’ve piqued your interest, check out the Cobalt GitHub repository at https://github.com/poef/cobalt/ and try it out yourself. The code and APIs are bound to change as we figure out how annotations work. If you have any suggestions or want to help me out, send me an e-mail or just add an issue on GitHub.

This blog is written by Auke van Slooten, our senior developer.