Copy from MS Word, Paste into a Rich Text WYSIWYG editor

Gabe Sumner
Apr 26, 2011

This title will send chills up the spine of web developers & content authors everywhere. Web developers fear the bloated markup this action produces. Content authors fear the difficulty of mixing their favorite authoring environment with their CMS’s editor.

Why is copy & paste a problem?

The problem isn't copy & paste. The problem is WHAT is being copied & pasted. Plain text (content without any styling) is completely safe to paste into a Rich Text editor. However, rich text content consists of 1) text and 2) styling.

This sentence has a bolded word.

In this example, my Rich Text editor added hidden markup around the word “bolded”. This markup instructs the web browser to apply special styling. If this content is copied & pasted into another program then this hidden styling is included.
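
To make the hidden markup visible, here is a small Python sketch that lists the tags travelling along with a copied sentence. Both HTML fragments are hand-written assumptions: the first approximates what a web editor might emit, and the second approximates the kind of markup MS Word tends to attach. The `hidden_markup` helper is a name invented for this illustration.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Records every opening tag together with its attributes."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append((tag, dict(attrs)))


def hidden_markup(html):
    """Return the (tag, attributes) pairs a reader never sees on screen."""
    collector = TagCollector()
    collector.feed(html)
    collector.close()
    return collector.tags


# Assumed output of a web-based Rich Text editor.
editor_html = 'This sentence has a <strong>bolded</strong> word.'

# Assumed approximation of the same sentence copied from MS Word.
word_like_html = (
    '<p class="MsoNormal">This sentence has a '
    '<b style="mso-bidi-font-weight:normal">bolded</b> word.</p>'
)

if __name__ == "__main__":
    print(hidden_markup(editor_html))
    print(hidden_markup(word_like_html))
```

Both sentences look identical on screen, yet the second one carries class names and style rules that mean nothing to the web site receiving them.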

And despite what you think, this isn't what you want…

MS Word is not good at creating web sites

There are plenty of choices for accessing the web (PC, Mac, phones, iPad, IE, Chrome, Firefox, Opera, etc.). Ideally, a web site needs to function reliably in all of these environments.

To address this challenge, web developers establish styling for the entire web site. This styling, in addition to creating a consistent visual experience, enables the web site to be adapted for each device or browser.

By importing styling from MS Word, authors are circumventing their web site’s styling.

As a result, styling that worked wonderfully in one environment (MS Word) will behave very poorly in another environment (your web site). Even if it looks okay during publishing, this imported styling will create insidious long-term issues for the web site.

What’s the solution to copy & paste?

As described above, the embedded styling (found in copy & pasted content) is the problem. Consequently, the solution is simple and obvious:

Copy the text, but remove the styling.

Towards this end, special ‘paste’ buttons are popular with many Rich Text editors:

However, this is a ridiculous waste of toolbar real estate. The Rich Text editor should automatically clean pasted content. The alternative is educating end-users regarding which of these 8 buttons they should click.

All major Rich Text solutions (TinyMCE, CKEditor, RadEditor) have options for automatically cleaning pasted rich text content.
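
As a rough idea of what "copy the text, but remove the styling" means in practice, here is a minimal Python sketch. It is not how TinyMCE, CKEditor, or RadEditor implement their cleaning; the `strip_styling` function, the tag lists, and the sample input are all assumptions made for illustration. It keeps the text and paragraph breaks and discards everything else.

```python
from html.parser import HTMLParser

# Tags that start a new paragraph in the cleaned output (assumed list).
BLOCK_TAGS = {"p", "div", "li", "br", "h1", "h2", "h3", "h4", "h5", "h6"}
# Tags whose content is never visible text (assumed list).
SKIP_TAGS = {"style", "script"}


class StyleStripper(HTMLParser):
    """Collects visible text, grouped into paragraphs, ignoring all markup."""

    def __init__(self):
        super().__init__()
        self.paragraphs = [[]]
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        elif tag in BLOCK_TAGS:
            self.paragraphs.append([])

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth:
            self.skip_depth -= 1
        elif tag in BLOCK_TAGS:
            self.paragraphs.append([])

    def handle_data(self, data):
        if not self.skip_depth:
            self.paragraphs[-1].append(data)


def strip_styling(html):
    """Return the pasted content as plain text, one blank line per paragraph."""
    parser = StyleStripper()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace inside each paragraph and drop empty ones.
    texts = (" ".join("".join(parts).split()) for parts in parser.paragraphs)
    return "\n\n".join(text for text in texts if text)


# Hand-written approximation of content pasted from MS Word.
pasted = (
    '<style>p.MsoNormal{margin:0}</style>'
    '<p class="MsoNormal"><span style="font-family:Calibri;color:red">'
    'Hello <b>bold</b> world</span></p>'
    '<p class="MsoNormal">Second paragraph</p>'
)

if __name__ == "__main__":
    print(strip_styling(pasted))
```

Note that the bold on "bold" is gone along with the fonts and colors. A gentler cleaner could keep a short allowlist of structural tags, but the trade-off stays the same: whatever is stripped has to be reapplied by the author.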

This solution has a downside though:

When styling is removed, the content will look radically different. This requires content authors to reapply the missing styling within the Rich Text editor. By doing this, authors are replacing MS Word styling with web-friendly styling.

This solution is unrealistic; content authors will revolt

Everything I've written is well known to developers. Furthermore, features for automatically detecting and cleaning dirty content are widely available.

However, these features are often disabled in the face of user revolt.

It’s normal for content authors to react negatively when their nicely formatted MS Word document turns to garbage in the CMS. Their complaints gain credibility because the same copy & paste worked fine in another CMS or Rich Text editor.

So…just disable the feature that strips MS Word styling and make them happy…

This will eventually ruin the web site, but the customer is always right. Right?

Is Clippy the solution to our problems?

This post has now come full circle and we’re no closer to a real-world solution:

  1. Developers remove pasted styling to protect the web site.
  2. Authors create content in their preferred writing environment.
  3. Authors want to move this content to the web site.
  4. Copy & paste is a logical choice.
  5. Authors are confused when everything goes to hell.
  6. Authors complain to developers.
  7. Developers allow pasted styling to make authors stop complaining.

However, as I look over this cascade of events, I see an opportunity for intervention at stage #5. Education (as much as technology) is the problem.

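To address this, here is what I propose: a dialog that appears when an author pastes MS Word content, explains the consequences, and lets the author decide how to proceed.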

I was chatting with a colleague about this dilemma and showed him this mockup. He replied with “you want Clippy” and then smiled. This reply severely shook my faith in my proposal. I certainly have no desire to interact with Clippy…

However, there is a lot I like about this proposal:

  • It doesn't involve an animated character
  • It empowers authors to make their own choice
  • It educates authors about the consequences
  • It only displays when relevant
  • It contains useful information
  • It will go away

None of these things could be said about Clippy.
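
The "only displays when relevant" point depends on the editor recognizing MS Word content at paste time. Here is a rough Python sketch of such a check. The `looks_like_word_paste` function is a hypothetical name, and the markers it searches for (class names starting with `Mso`, `mso-` style properties, `<o:p>` tags, the Office XML namespace) are heuristics I am assuming are typical of Word output, not an exhaustive or authoritative list.

```python
import re

# Heuristic fingerprints assumed to be typical of MS Word clipboard HTML.
WORD_MARKERS = re.compile(
    "|".join(
        [
            r"class=[\"']?Mso",                  # e.g. class="MsoNormal"
            r"style=[\"'][^\"']*mso-",           # e.g. style="mso-..."
            r"<o:p>",                            # Office paragraph tag
            r"urn:schemas-microsoft-com:office", # Office XML namespace
        ]
    ),
    re.IGNORECASE,
)


def looks_like_word_paste(html):
    """Return True when pasted HTML carries markers suggesting MS Word."""
    return bool(WORD_MARKERS.search(html))


if __name__ == "__main__":
    samples = [
        "<p class=MsoNormal>Quarterly report<o:p></o:p></p>",
        "<p>Typed directly into the <strong>CMS</strong></p>",
    ]
    for sample in samples:
        action = "show the prompt" if looks_like_word_paste(sample) else "paste silently"
        print(action)
```

Content typed in the CMS, or pasted as plain text, never triggers the prompt, so authors only see it at the moment the explanation is useful.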

If you build it, they will come!

Everything described happens because authors avoid writing content in their CMS’s Rich Text editor. The hacky style stripping & modal windows are completely unnecessary if authors simply type the content in the CMS.

Towards that end, I’m very interested in creating an attractive web-based authoring experience. Why are authors avoiding web-based authoring tools in favor of off-line tools? How can we change this behavior?

There are some big players (Google Documents, Word Live) that are also wrestling with this challenge. This topic is covered in another post.

Originally published at gabesumner.com on April 26, 2011.

Gabe Sumner

Coder, Marketer, Storyteller, Hiker. Working as a Demo Engineer at Salesforce. Opinions are my own.