Building Content Kit Editor on ContentEditable

Part of the fun of working on the Content Kit Editor for Bustle has been that it has given me a new perspective on the text-editing experience. Moving a cursor around a document, applying formatting to a selection, rearranging paragraphs — word processing, in other words — feels natural when doing it, but specifying it (the first step in coding it) reveals some non-obvious and even unintuitive facts.

For instance, suppose the cursor sits on an empty line directly below a line of text. Where will the cursor be after hitting the up arrow (↑)?

Here’s another. If the cursor is positioned as shown below and you type the letter “a”, will it be bold or plain?

The answer to both of these questions is, “it depends.” It depends on what editor you’re using, what browser you’re using (if you are editing with a browser-based tool), and what you were doing previously.

In the case of the up arrow, it depends on how the cursor got to where it is (and what editor you’re using). In most editors, if you used the down arrow (↓) to move to the empty line, the up arrow takes you back to where you had been in the non-empty line above it. It is a path-dependent process. But if you had used the mouse to put the cursor on that line, the up arrow moves you to the start of the previous line. Or, if you’re editing with Medium’s editor, the up arrow always moves you to the start of the line above you, no matter what.
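To make the path-dependence concrete, here is a minimal sketch of the "remembered column" behavior in TypeScript. It is a toy model rather than how any particular editor implements it, and every name in it (TextCursor, moveVertically, clickAt) is made up for illustration.

```typescript
// Toy model of path-dependent vertical cursor movement.
// All names here are hypothetical, chosen only for this illustration.

interface TextCursor {
  line: number;              // index into the lines array
  column: number;            // actual offset within the current line
  goalColumn: number | null; // column remembered across consecutive up/down presses
}

function moveVertically(lines: string[], cursor: TextCursor, delta: -1 | 1): TextCursor {
  const target = cursor.line + delta;
  if (target < 0 || target >= lines.length) {
    return cursor; // already on the first or last line: nothing to do
  }
  // Reuse the remembered column if there is one; otherwise start remembering now.
  const goal = cursor.goalColumn ?? cursor.column;
  const targetLine = lines[target] ?? "";
  return {
    line: target,
    column: Math.min(goal, targetLine.length), // clamp to the target line's length
    goalColumn: goal,
  };
}

// A mouse click places the cursor and discards any remembered column.
function clickAt(line: number, column: number): TextCursor {
  return { line, column, goalColumn: null };
}

const sampleLines = ["Hello, world", ""];

// Arrive on the empty line with the down arrow, then press up.
const viaArrow = moveVertically(sampleLines, moveVertically(sampleLines, clickAt(0, 7), 1), -1);
console.log(viaArrow.column); // 7: back where we started

// Arrive on the empty line with a click, then press up.
const viaClick = moveVertically(sampleLines, clickAt(1, 0), -1);
console.log(viaClick.column); // 0: start of the previous line
```

An editor that always jumps to the start of the line above, as described for Medium, would amount to never storing goalColumn at all.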

In the case of the bold text, most editors will insert a plain-text letter “a” when you type. But, if you’re using Google Docs and you had just backspaced over a bold character, your “a” will be bold. Or, if you’re writing text in Firefox using contentEditable, the text will be plain text if your cursor approached that position from the left but bold if the cursor approached it from the right.
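The direction-sensitive behavior can be written down as a tiny policy function. The sketch below assumes the cursor sits at a boundary with plain text on its left and bold text on its right, which is my reading of the scenario rather than something the browsers document. The names (FormatBoundary, directionSensitiveStyle, leftBiasedStyle) are hypothetical.

```typescript
// Toy model: the style of newly typed text at a formatting boundary.
// Assumption: plain text to the left of the cursor, bold text to the right.

type TextStyle = "plain" | "bold";
type Approach = "fromLeft" | "fromRight";

interface FormatBoundary {
  left: TextStyle;  // style of the text just before the cursor
  right: TextStyle; // style of the text just after the cursor
}

// Direction-sensitive policy: the cursor takes on the style of the
// text it just moved through to reach the boundary.
function directionSensitiveStyle(boundary: FormatBoundary, approach: Approach): TextStyle {
  return approach === "fromLeft" ? boundary.left : boundary.right;
}

// A simpler fixed policy: always take the style to the left of the cursor.
function leftBiasedStyle(boundary: FormatBoundary): TextStyle {
  return boundary.left;
}

const sampleBoundary: FormatBoundary = { left: "plain", right: "bold" };

console.log(directionSensitiveStyle(sampleBoundary, "fromLeft"));  // plain
console.log(directionSensitiveStyle(sampleBoundary, "fromRight")); // bold
console.log(leftBiasedStyle(sampleBoundary));                      // plain
```

Neither policy is more correct than the other. The point is only that an editor has to pick one, and users will notice whichever one it picks.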

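If you want to try out the scenarios I described above, a simple contentEditable div is all you need to experiment with: put a <div contenteditable="true"> element containing a few lines of partly bold text on a page and open it in your browser.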

The point is, there ends up being a lot of nuance in how seemingly self-evident actions like moving the cursor and entering text are implemented.

To make the editing surface of the Content Kit Editor feel right, part of our strategy has been to minimize how much of this nuance we have to define explicitly (and then code) ourselves. Enter contentEditable.

contentEditable, very briefly, is an HTML attribute that you can set to make an HTML element editable by someone viewing the web page. A cursor shows up when they click in it, and they can type text to change what is shown there, in a what-you-see-is-what-you-get (WYSIWYG) style. This attribute, which has existed since Internet Explorer 5.5 (in 2000!), theoretically allows the user full WYSIWYG control over a part of the page. In practice, of course, despite having been around for 15 years, contentEditable is under-specified, implemented differently by every browser, and in fact somewhat terrible.

One thing it does a pretty good job of, though, is providing a reasonable editing surface. It handles up arrows, down arrows, shift plus arrow keys for selection, deleting text, etc., fairly well.

Content Kit Editor thus leans on contentEditable to provide the base layer of a reasonable editing surface, and our strategy has been to slowly chip away at it by capturing, canceling, and re-interpreting native events that may do something unexpected, like cause the browser to insert unwanted “bookkeeping” HTML. So far this has meant that actions like hitting enter (which can insert redundant <br> tags) or delete in certain situations, as well as any keystroke that happens while text is selected, need to be “handled”. We need to re-interpret each of these events as its semantic equivalent, apply that semantic change to Content Kit’s internal data structure, and prevent the browser from doing what it might otherwise have done (namely, muck with our HTML).
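As a rough illustration of that capture, cancel, and re-interpret loop, here is a sketch for the enter key in TypeScript. It is not Content Kit's actual code: EditorModel, KeyEventLike, splitParagraph, handleKeydown, and renderModel are all hypothetical names, and the "internal data structure" is reduced to a list of paragraph strings.

```typescript
// Sketch of "capture, cancel, re-interpret" for the enter key.
// Every name here is hypothetical; this is not Content Kit's API.

// The two members of a browser KeyboardEvent that this sketch relies on.
interface KeyEventLike {
  key: string;
  preventDefault(): void;
}

// A deliberately tiny internal representation of the document.
interface EditorModel {
  paragraphs: string[];
  cursor: { paragraph: number; offset: number };
}

// Semantic operation: split the current paragraph at the cursor.
function splitParagraph(model: EditorModel): void {
  const { paragraph, offset } = model.cursor;
  const text = model.paragraphs[paragraph] ?? "";
  model.paragraphs.splice(paragraph, 1, text.slice(0, offset), text.slice(offset));
  model.cursor = { paragraph: paragraph + 1, offset: 0 };
}

// Returns true when the event was handled and the view should be re-rendered.
function handleKeydown(event: KeyEventLike, model: EditorModel): boolean {
  if (event.key === "Enter") {
    event.preventDefault(); // keep the browser from inserting its own markup
    splitParagraph(model);  // apply the semantic change to the model instead
    return true;
  }
  return false; // everything else falls through to contentEditable
}

function escapeHtml(text: string): string {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Rendering from the model is what keeps the HTML clean and predictable.
function renderModel(model: EditorModel): string {
  return model.paragraphs.map((p) => `<p>${escapeHtml(p)}</p>`).join("");
}

// Simulate pressing enter with the cursor after "Hello".
const demoModel: EditorModel = {
  paragraphs: ["Hello world"],
  cursor: { paragraph: 0, offset: 5 },
};
const preventedCalls: string[] = [];
const fakeEnter: KeyEventLike = {
  key: "Enter",
  preventDefault: () => { preventedCalls.push("Enter"); },
};

const wasHandled = handleKeydown(fakeEnter, demoModel);
console.log(wasHandled);             // true
console.log(renderModel(demoModel)); // <p>Hello</p><p> world</p>
```

In a browser, a real KeyboardEvent has both members of KeyEventLike, so handleKeydown could be called from a keydown listener on the editable element. A real editor would also have to put the cursor back in the right place after re-rendering, which this sketch leaves out.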

This provides a good balance between the control we need (to keep our HTML clean and in sync with the internal data structure) and our desire to not go down the rabbit hole of writing an entire text- and cursor-positioning layout engine. Different editors are at different points along this continuum between control and complexity. Editors that have a markedly different visual representation from their internal representation may opt for more control. Google Docs, at the far end of the continuum, processes all user input and goes so far as to draw the entire editing surface, including the cursor.