ContentEditable — The Good, the Bad and the Ugly

Every once in a while some developer notices that there’s still no perfect WYSIWYG editor for the web on the market and then this happens:

How hard can it be? We’ve got contentEditable, execCommand() and queryCommandState(). Now, let’s just add some toolbar with stylish buttons to enable applying bold and linking the text. We’ll work a bit on typography, use some trendy SVG/font icons for the buttons, add some CSS transitions here and there and we are mostly ready. Right?

Just a couple of details left… How do I force the bold command to use <strong> instead of <b> (choose whichever you like more, actually) in every browser, force the Enter key to create new paragraphs instead of those ugly <br> or <div> elements, and fix that damn pasting? I don’t want all those inline styles!

A month and several questions on StackOverflow later the project is either dead or full of the most brutal hacks which seem to work (that’s actually the worst situation: poor end-users). The developer joins the “contentEditable is evil” club and hates WYSIWYG editors even more.

Those three small details (how the bold command and the Enter key work, plus cleaning up pasted data) are just the tip of the iceberg, but they are already enough to make even a skilled JavaScript developer hate “this entire contentEditable”.

Let’s consider how the behavior of the bold command can be altered. I’ve seen many approaches from the hacky ones like using mutation observers, through normalisation of HTML when retrieving it from the editor, to a complete (in my opinion, the only real) solution which is reimplementing the execCommand() behavior in JavaScript. But again, this is only the tip of the iceberg.

If you chose the mutation observers, you may need to deal with preserving selection (as it may be reset when you modify the DOM around it) or with the undo manager.

If you accepted normalisation of the output you’re pretty safe, but I can hardly imagine the same approach to the Enter key support and well… it smells bad.
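To make the normalisation approach concrete, here is a deliberately naive sketch. It assumes the editor's output is already well-formed HTML and only rewrites tag names in the serialized string; the function name and the tag map are mine, not part of any browser API.

```javascript
// Naive output normalisation: rewrite <b>/<i> tags to <strong>/<em>
// in serialized HTML. Assumption: the input is well-formed markup, so a
// pattern over tag names is good enough for a sketch. A real
// implementation would walk the DOM instead of matching strings.
const TAG_MAP = { b: 'strong', i: 'em' };

function normalizeOutput(html) {
  // Matches "<b", "</b", "<i", "</i" only when followed by whitespace,
  // "/" or ">", so tags such as <br> or <img> are left untouched.
  return html.replace(/<(\/?)(b|i)(?=[\s/>])/gi, (match, slash, name) => {
    return '<' + slash + TAG_MAP[name.toLowerCase()];
  });
}

console.log(normalizeOutput('<p><b>Bold</b> and <i class="x">italic</i><br></p>'));
```

This only covers what you read back from the editor. The live DOM still contains whatever the browser decided to insert, which is exactly why the approach smells.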

If you’re a purist like me, you would perhaps try to implement execCommand() in JavaScript. The algorithm is pretty straightforward: get the selection, get the ranges (did I mention that Firefox supports multi-range selections?), do a couple of DOM operations, set the selection again, implement an undo manager (because your changes won’t be recorded by the native one) and you’re safe. Except that:

  • You were fired for delaying the project for a few months.
  • Your tests don’t pass in any other browser (oh and did I mention that Blink and WebKit’s selection systems are broken? Yes… the bug is 8 years old) and you’re about to implement one of the ugliest hacks you’ll ever see.
  • The first bug reports start to appear and you start to realise how many more cases your implementation needs to support.
  • You still need to implement the Enter key support and do something with the damn pasting.
  • You’ve just noticed that you also need to reimplement Backspace and Delete key support in Blink and WebKit as these engines love inline styles far more than you do.
  • You’ve noticed that some engines return a href attribute with an absolute URL even though you made a link with a relative URL.
  • Gosh… we only talked about the four most basic features. When will you add support for images?
  • Didn’t you know that the selection has a direction?
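The last point is easy to overlook, so here is a small sketch of what "direction" means. It is a simplification: positions are plain text indexes rather than DOM (node, offset) pairs, and both function names are hypothetical.

```javascript
// `anchor` is where the selection started, `focus` is where the caret
// currently is. A selection is backward when the focus precedes the anchor.
function normalizeSelection(anchor, focus) {
  return {
    start: Math.min(anchor, focus),
    end: Math.max(anchor, focus),
    isBackward: focus < anchor,
  };
}

// Extending (think Shift+Arrow) moves the focus, never the anchor,
// so the result depends on the direction of the existing selection.
function extendSelection(selection, delta) {
  const anchor = selection.isBackward ? selection.end : selection.start;
  const focus = (selection.isBackward ? selection.start : selection.end) + delta;
  return normalizeSelection(anchor, focus);
}

const forward = normalizeSelection(3, 8);
const backward = normalizeSelection(8, 3);
console.log(extendSelection(forward, -1));  // shrinks from the end
console.log(extendSelection(backward, -1)); // grows at the start
```

Both selections cover the same characters, yet the same key press changes them differently. An implementation that stores only start and end gets this wrong.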

Enough. I think I’ve made the point. ContentEditable is terrible. Related APIs and their implementations such as selection, clipboard and drag and drop are incomplete and/or inconsistent and buggy. The Range API is complicated and inconvenient. But your fridge isn’t empty yet and now you know what to do.

Goodbye, contentEditable

The idea is simple. ContentEditable is evil and the Selection API is its evil twin. Avoid them as much as possible. So what’s the plan?

  1. You need a custom selection system. It must be doable nowadays to get a position (represented by a range, but we’ll simplify this in a moment) in the DOM on mousedown/mousemove. You’ll display your custom caret (whoa… you can control its style now) and text selection (it’s a simple <span>).
  2. You need to handle typing. Let’s listen to the keyboard events and insert the given character into the editor.
  3. You need to handle navigation using the Arrow keys. Left and right seem easy, up and down a bit trickier but if you have point 1, you’ll do this too. Wait, there’s the Alt modifier which makes the Left and Right Arrow keys jump entire words. But of course you’ll look for spaces in the text and that’s it — told you, it’s trivial!
  4. Then you need to do something with pasting. You see that with the Clipboard API you can listen to the paste event on the document and retrieve the data from the data transfer. You can also use the old paste bin mechanism.

The world is beautiful again. You’re surrounded by well known APIs or your own code which you can finally control. Say bye to that terrible feeling of being hopeless. You use the Selection API and ranges only for a direct interaction with the browser, but internally you implemented different mechanisms, more appropriate for text editing. No need to work on that ambiguous DOM. Only text, styles, indexes and translation to DOM. Applying bold is now a simple algorithm — add the style to selected letters and your automatic translation will update the DOM.
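The "text, styles, indexes and translation" idea could look roughly like the sketch below. Everything in it is an assumption made for illustration: the per-character model, the style names and the tag mapping are hypothetical, and the translation step produces an HTML string rather than patching a live DOM.

```javascript
// The document is a flat list of characters, each carrying its own styles.
function createModel(text) {
  return Array.from(text, (char) => ({ char, styles: new Set() }));
}

// Applying a style touches only the model, never the DOM.
function applyStyle(model, start, end, style) {
  for (let i = start; i < end && i < model.length; i++) {
    model[i].styles.add(style);
  }
}

const STYLE_TAGS = { bold: 'strong', italic: 'em' }; // hypothetical mapping

function styleKey(item) {
  return Array.from(item.styles).sort().join(',');
}

function escapeHtml(char) {
  return { '&': '&amp;', '<': '&lt;', '>': '&gt;' }[char] || char;
}

// Translation: group adjacent characters with identical styles into runs
// and wrap each run in the matching tags.
function render(model) {
  let html = '';
  let i = 0;
  while (i < model.length) {
    const key = styleKey(model[i]);
    let text = '';
    while (i < model.length && styleKey(model[i]) === key) {
      text += escapeHtml(model[i].char);
      i++;
    }
    const styles = key ? key.split(',') : [];
    const open = styles.map((s) => '<' + STYLE_TAGS[s] + '>').join('');
    const close = styles.slice().reverse().map((s) => '</' + STYLE_TAGS[s] + '>').join('');
    html += open + text + close;
  }
  return html;
}

const model = createModel('Hello world');
applyStyle(model, 0, 5, 'bold');
console.log(render(model)); // <strong>Hello</strong> world
```

There is no ambiguity about where a tag starts or ends, because the model has no tags at all. The price is that everything the browser used to do for free now has to go through this model.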

The food in the fridge is scarce, so you publish your project on GitHub. You start receiving the first bug reports, but no worries, no software is perfect from day one.

How Do I Type Polish Characters?
When I press Alt+L I expect that ‘ł’ is inserted, but your editor inserts ‘l’.

Right, let’s see what you can do. You don’t know the keyboard layout, so you can’t simply check for the Alt modifier. Besides, there are too many languages. OK, there’s KeyboardEvent.key from DOM Level 3 Events, but so far it only works in IE and Firefox. Blink will support it soon, too, so you can wait. There’s a high chance that all major browsers will support it within a reasonable time.
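Assuming KeyboardEvent.key is available, the decision of what to insert could be sketched like this. The function name is hypothetical and the modifier check is only a heuristic; it takes any object with the same fields as a keydown event, so it can be tried without a browser.

```javascript
// Decide which text (if any) a keydown event should insert.
// `event` only needs the KeyboardEvent fields that are read here.
function textToInsert(event) {
  const { key, ctrlKey, altKey, metaKey } = event;
  if (typeof key !== 'string') return null;
  // Named keys ("Enter", "ArrowLeft", "Dead", ...) are longer than one
  // code point, so they never count as printable text.
  if (Array.from(key).length !== 1) return null;
  // Heuristic (an assumption): Ctrl or Meta on their own mean a shortcut,
  // but Ctrl+Alt together is how AltGr is commonly reported on Windows.
  if (metaKey || (ctrlKey && !altKey)) return null;
  return key;
}

console.log(textToInsert({ key: 'ł', altKey: true })); // ł
console.log(textToInsert({ key: 'Enter' }));           // null
console.log(textToInsert({ key: 'z', ctrlKey: true })); // null
```

The point is that the layout question disappears: the browser reports the character the layout produced, and the editor no longer guesses from the Alt modifier.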

How Do I Type Accents?
I have a Spanish keyboard layout. When I press ‘`’ and then a letter (e.g. `u`) I expect to first see the ‘tick’ and then letter ‘ù’ in the same place.

Wait, what? Does he say that two keys are transformed into one letter? That’s crazy… you can handle these special cases somehow (if you can find out what keyboard layout is in use), but let’s hope that there aren’t many languages which work this way.
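Just to show what "two keys become one letter" means, here is a sketch of the transformation itself, using Unicode normalisation. The accent table and the function name are my assumptions; in reality the OS and the keyboard layout do this, and the editor should not try to replicate it.

```javascript
// Spacing accent characters mapped to their combining counterparts.
const COMBINING = {
  '`': '\u0300', // grave
  '\u00b4': '\u0301', // acute
  '^': '\u0302', // circumflex
  '~': '\u0303', // tilde
  '\u00a8': '\u0308', // diaeresis
};

function composeAccent(accent, letter) {
  const mark = COMBINING[accent];
  if (!mark) return accent + letter;
  // NFC merges a base letter and a combining mark into one precomposed
  // character when Unicode defines one.
  const composed = (letter + mark).normalize('NFC');
  // No precomposed form: fall back to showing both characters.
  return Array.from(composed).length === 1 ? composed : accent + letter;
}

console.log(composeAccent('`', 'u')); // ù
console.log(composeAccent('`', 'x')); // `x
```

Knowing the transformation does not help much, though. You would still need to know which keys are dead keys on the user's layout, and that information is not available to the page.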

How Do I Type Hiragana Characters?
When I start typing this popup should appear and the word I’m currently typing should be underlined. It’s the so-called composition.

Right, composition events. You’re sure you can do something with them (if only the browser would fire them). What’s worse, you find out that the Input Method Engine works differently in every OS and is often integrated with the OS (e.g. it learns new words and implements smart autocompletion). You’re beginning to feel hopeless once again. But then a brilliant idea comes to your mind — you can use a hidden textarea from which you’ll read the input. If you position it close to the caret you’ll even have the popup. You only lose contextual suggestions. There’s also some work on opening the IME API. One day you’ll remove all the hacks!
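If the browser does fire composition events, the bookkeeping on the editor's side could be sketched as follows. The class is hypothetical; it only reads the `type` and `data` fields that compositionstart, compositionupdate and compositionend events carry, so plain objects are enough to try it out.

```javascript
// Tracks an in-progress composition separately from committed text.
class CompositionBuffer {
  constructor() {
    this.committed = '';
    this.pending = '';
    this.composing = false;
  }

  handle(event) {
    switch (event.type) {
      case 'compositionstart':
        this.composing = true;
        this.pending = '';
        break;
      case 'compositionupdate':
        // The pending text is replaced as a whole on every update.
        if (this.composing) this.pending = event.data;
        break;
      case 'compositionend':
        // Only now does the text become part of the content.
        this.committed += event.data;
        this.pending = '';
        this.composing = false;
        break;
    }
  }

  // What to display: committed text plus the pending (underlined) part.
  view() {
    return {
      text: this.committed + this.pending,
      underlineFrom: this.composing ? this.committed.length : -1,
    };
  }
}

const buffer = new CompositionBuffer();
buffer.handle({ type: 'compositionstart', data: '' });
buffer.handle({ type: 'compositionupdate', data: 'k' });
buffer.handle({ type: 'compositionupdate', data: 'か' });
buffer.handle({ type: 'compositionend', data: 'か' });
console.log(buffer.view());
```

This handles the text, but not the candidate popup or the OS-level learning mentioned above. Those stay outside the page's reach.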

How Do I Type Using My iPad?
When I tap the editor, the keyboard does not appear.

No focus in an editable field = no keyboard. Surely the trick with a hidden textarea will work. Will it? How do you paste now?

Alt+Left/Right Arrow Should Jump Over One Word
This text ‘ພາສາຈີນແມ່ນພາສາໜຶ່ງທີ່ເວົ້າໃນປະເທດຈີນ.’ contains many words, but your editor handles it as one word.

But where are the spaces?!
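A sketch of the difference, assuming a runtime that provides Intl.Segmenter (my assumption; the original approach only looked for spaces). How well the Lao sample is split depends on the dictionary data shipped with the runtime, so no exact count is promised here.

```javascript
const LAO = 'ພາສາຈີນແມ່ນພາສາໜຶ່ງທີ່ເວົ້າໃນປະເທດຈີນ.';

// The "trivial" approach: words are whatever sits between spaces.
function wordsBySpaces(text) {
  return text.split(/\s+/).filter(Boolean);
}

// Locale-aware approach: let the runtime's segmenter find word boundaries.
function wordsBySegmenter(text, locale) {
  const segmenter = new Intl.Segmenter(locale, { granularity: 'word' });
  return Array.from(segmenter.segment(text))
    .filter((part) => part.isWordLike) // drop spaces and punctuation
    .map((part) => part.segment);
}

console.log(wordsBySpaces(LAO).length); // 1
console.log(wordsBySegmenter('hello world', 'en')); // [ 'hello', 'world' ]
console.log(wordsBySegmenter(LAO, 'lo').length); // depends on the runtime's data
```

Word jumping then becomes a matter of moving the caret to the nearest segment boundary instead of the nearest space.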

Undoing by Shaking Does Not Work on iPhone

Right, it’s bound to Ctrl+Z. Time to use the accelerometer.

Keyboard Is Hidden When I Do a Selection on iPhone/iPad

Right. Focus moves from the hidden textarea to the document…

Spell Checking Doesn’t Work

The horror! The horror!

I Can’t Use Your Editor with a Keyboard and a Screen Reader
Your editor isn’t accessible. Normally when I tab into an editor, my screen reader notifies me about it. Then, when I navigate through the text it reads the surrounding words. It also reads what I type. Nothing like this happens in your editor. BTW. Did you see

No, you didn’t. And you should have.

Back to Square One (and a Half)

I think the lesson learned is that contentEditable may be terrible, but it is already here. I’m sure that for all the issues that I mentioned there will be native APIs one day, but believe me, that day is still far off. Standardising such complex features is an extremely tough job, because what you see is still the tip of the iceberg. Even though I’ve been working with contentEditable for nearly four years (that’s nothing compared to Frederico Knabben’s 13 :D) I still find W3C’s public-web-apps and public-editing-tf mailing lists eye-opening. Moreover, there are many use cases for editing in the browsers and even WYSIWYG text editors differ among themselves so it would be unfair to lock browsers just for one use case.

The Editing Task Force

Talking about standards: contentEditable needs one, because we need contentEditable. Years will pass before there is any chance of implementing a full-featured, stable and usable WYSIWYG editor without contentEditable. Therefore, the W3C Editing Task Force was created last year (Frederico Knabben and I both joined it), and some tough discussions started there on what to do with the current situation. I will write a separate post about this effort as it looks really promising.

(Edit: You can read more about contentEditable standardization in “Fixing ContentEditable”.)

ContentEditable — the Good Parts

ContentEditable is like JavaScript, only Douglas Crockford hasn’t written a book about it yet. But (like JavaScript) it also has its good parts (yeah, I know, the good/bad ratio is debatable). It’s amazing that adding one attribute to an HTML element enables typing, selection, keyboard navigation, spell checking, drag and drop, pasting and an undo manager; that all of this is integrated with the OS; that such an editor can be used with a screen reader or on a touchscreen device; and that it’s well internationalised. Let’s focus on these good parts and forget the bad ones.

Over the past years, while working on CKEditor I noticed that we were gradually replacing native features with our own implementations. It started with a custom behavior of the Enter key, commands (moving from the execCommand() API to the ability to apply, remove and check the state of a specific style like bold), an undo manager and intercepted pasting so that the pasted content can be filtered. Then some improvements to the selection system were added (such as locking it when the editor is blurred, which allows the implementation of modals) together with enhanced navigation in tables and completely custom list editing. Since version 4.0, CKEditor has its custom “insert HTML into selection” mechanism and a feature allowing reaching non-editable places. CKEditor 4.1 introduced highly customisable content filtering (no more mess on paste). CKEditor 4.3 brought support for non-editable islands with editable islands inside, which required overriding many native systems (selection, keyboard, focus, clipboard). Somewhere in the meantime we implemented custom Backspace and Delete support to work around Blink’s and WebKit’s broken implementations. Finally, just a few weeks ago we made a final takeover of the clipboard, which means that in some browsers copy, cut and paste operations are fully handled by CKEditor.

It means that today CKEditor does not let the browser do anything to the content except handling typing, some deleting and that’s basically it. At the same time, it still uses the native selection system, keyboard navigation and other APIs such as those related to clipboard or focus management.

With CKEditor 5 we are planning to conclude this process by letting the browser insert text only, but with CKEditor’s control (edit: read more in “CKEditor 5: The Future of Rich Text Editing”). Ironically, we plan to base all the editing algorithms on a custom data model as we agree that DOM is not the perfect tool for this job. Of course, we are reaching the moment where we may face internationalisation issues (like the support for Alt+Backspace) but we are aware of this. What’s more, this is the piece of functionality which may be opened by the browsers as one of the first ones, thanks to the work done by the Editing Task Force. I hope to write more about this in the near future (edit: read more in “Fixing ContentEditable”).

Editing Framework

All this sounds nice, but the amount of work needed to reach such a state is still huge. Even with the know-how it does not seem to be a project which can be approached by a single developer or even a medium-sized company. Therefore, we believe that an editing framework, which will allow other developers to build their customised solutions on top of it, needs to be implemented. This is one of the goals that we defined for CKEditor 5; however, projects like Alloy Editor prove that this is doable even with CKEditor 4. If only developers didn’t start over with a clean contentEditable as soon as they find one of the existing editors slightly disappointing, we could have been in a different place today ;).