Lessons Learned: Migrating Fill in the Blank Assessments from CodeMirror to Monaco

Shirley Lin
Codecademy Engineering
8 min read · Jan 29, 2021
A Fill in the Blank Assessment on Codecademy

At Codecademy, we are committed to building the best learning experience possible. A large part of this is providing an in-browser, no-setup coding experience for our learners that gives them much of the same feeling as coding locally in an IDE.

To accomplish this, we leverage Microsoft’s Monaco Editor project. Fun fact: if you use VS Code as your IDE, Monaco is the text editor that powers it!

via code.visualstudio.com

But it wasn’t always that way! In fact, about a year ago, we were using CodeMirror for our editor needs. We switched to Monaco because of its rich features, which include:

  • better accessibility support (our long-term goal is to provide a fully accessible editor experience)
  • code diffing (which is now integrated into our product, where we use it to show the difference between your code and solution code)
  • IntelliSense, snippets, color themes, and other features that we haven’t started to integrate into Codecademy yet

By late 2020, the only remaining piece of the codebase that was still using CodeMirror was what we call CodeBlocks. CodeBlocks are used throughout the site to display code snippets outside of the editor experience. For instance, they may appear on cheatsheets or in articles. CodeMirror was powering the syntax highlighting of these blocks.

Act I: The Migration

Syntax highlighting — it seems simple, right? Moving from CodeMirror’s syntax highlighting API to Monaco’s API wasn’t too complicated, as the interface is fairly similar. Here is what Monaco’s colorizer API looks like:

const colorized = await monaco.editor.colorize(text, language, {});

The variable colorized now contains HTML whose class names tie into Monaco’s theming (which supplies the CSS styles). We pass that HTML to our CodeBlock component.

There was one place where we ran into issues, though — Fill in the Blank questions. On Codecademy, we use CodeBlocks in quizzes to ask our learners questions about code to reinforce what they’ve learned. This is what that looks like:

The tricky part here is that:

  1. We need our code to be syntax highlighted (through CodeMirror or Monaco).
  2. We also need to know where our blanks that users fill in should go.
  3. We need to find these blanks and replace them with a different, interactive React component that controls all the special behavior that the blanks have, such as drag & drop and clearing the blank.
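The three requirements above can be sketched in plain JavaScript. This is a minimal illustration, not Codecademy’s actual implementation: the markBlanks helper and the data-blank-index attribute are hypothetical, the fitb-locator class name is borrowed from the code comment quoted later in this post, and the sample HTML only imitates the kind of markup a colorizer might return. The idea is to swap each delimiter in the highlighted HTML for a locator element that an interactive component can later attach to.

```javascript
// Hypothetical helper: replace every delimiter in already-highlighted HTML
// with an empty locator <span> and report how many blanks were found.
function markBlanks(highlightedHtml, delimiter) {
  const pieces = highlightedHtml.split(delimiter);
  const html = pieces
    .map((piece, index) =>
      // Every gap between two pieces is where a delimiter used to be.
      index < pieces.length - 1
        ? `${piece}<span class="fitb-locator" data-blank-index="${index}"></span>`
        : piece
    )
    .join('');
  return { html, blankCount: pieces.length - 1 };
}

// Sample markup that only imitates colorizer output (an assumption).
const sampleHtml = '<span class="mtk1">p + fitbblank (fitbblank = 1)</span>';
const { html, blankCount } = markBlanks(sampleHtml, 'fitbblank');

console.log(blankCount); // 2
console.log(html.includes('fitbblank')); // false
```

Note that this only works if the delimiter survives highlighting as one contiguous string, which is exactly the assumption the rest of this post is about.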

Our curriculum developers, who create these questions, simply write a short code snippet. To turn that code snippet into an interactive Fill in the Blank question, we need some sort of agreement on how to mark where the blanks should go.

To accomplish this, we use a special delimiter. In this case, our delimiter is a special word, or sequence of characters, that we all decide means “this is where a blank goes”. When we run the code snippet through our syntax highlighter, though, we have to be careful that the syntax highlighting doesn’t recognize the delimiter as a reserved word (like const) or get tripped up by its syntax (like __~BLANK~__). The easiest way to do this is to use a simple word, so that the highlighting recognizes it as a variable name.

In the CodeMirror world, we used the string fitbblank. For example, this is what a raw Fill in the Blank question might look like:

p + fitbblank (fitbblank = fitbblank, fitbblank = fitbblank)

When properly rendered, each fitbblank will map to an interactive blank. When switching over to Monaco, though, we ran into an issue!

what happened to that last blank??

Occasionally, a fitbblank would not be replaced with the Blank component, rendering the quiz unusable. The colorized code that Monaco returned looked something like this:

<span>... = fitbb</span><span>lank</span>

What made this particularly hard to track down was that it didn’t happen in the majority of quizzes. Eventually, we narrowed it down: for performance reasons, Monaco chops tokens apart if they are longer than 50 characters. What counts as a token varies by programming language, but in this case the entire line was one token.
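To see why only some quizzes broke, here is a small simulation. It is not Monaco’s code: chopIntoSpans is a hypothetical stand-in that wraps every 50 characters of a line in its own span, which approximates the behavior described above. Whether the delimiter survives depends entirely on where it happens to fall relative to a 50-character boundary.

```javascript
// Hypothetical stand-in for the chopping behavior described above:
// wrap each run of at most `maxLength` characters in its own <span>.
function chopIntoSpans(line, maxLength = 50) {
  const spans = [];
  for (let start = 0; start < line.length; start += maxLength) {
    spans.push(`<span>${line.slice(start, start + maxLength)}</span>`);
  }
  return spans.join('');
}

const delimiter = 'fitbblank';

// Short line: the delimiter sits well inside the first 50 characters.
const shortLine = 'x = ' + delimiter;
// Long line: 45 characters of padding push the delimiter across the boundary.
const longLine = 'a'.repeat(42) + ' = ' + delimiter;

console.log(chopIntoSpans(shortLine).includes(delimiter)); // true
console.log(chopIntoSpans(longLine).includes(delimiter)); // false
console.log(chopIntoSpans(longLine).endsWith(' = fitbb</span><span>lank</span>')); // true
```

A plain string search for the delimiter finds nothing in the second case, because the closing and opening span tags now sit in the middle of it.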

Lesson Learned: With a package that you don’t own or fully understand, unexpected behavior can (and a lot of the time, will!) happen.

Act II: The Domino Falls

Once the root cause was identified, the challenge of fixing it started. Our first thought was to see if we could get Monaco to play by our rules. Unfortunately, the token length before splitting is currently not an option that is exposed to users of the package. For most cases, this doesn’t cause issues with using their syntax highlighting implementation — except in this rare case where we need to ensure that our blank delimiter does not get chopped up! One option was to submit an issue to Monaco to expose the option of turning the chopping off (or allowing adjustment of length). While this is the ideal fix, the timeline of this called for an immediate fix first.

Upon thinking about this problem, we realized that a single character delimiter would be the only delimiter that would ensure it would not be chopped by Monaco. But what character would suffice? We couldn’t use something that might be used by our curriculum developers legitimately, e.g. & or * , as that would cause any use of that character to be turned into a blank.

Lesson Learned: Technical decisions aren’t always purely technical. Oftentimes, they involve balancing other teams’ needs, future projects, deadlines, and a number of other factors.

We could’ve chosen a fairly obscure character like ë, but what would happen in the future if we translated our content into a language that uses that character? What would happen when a new curriculum developer joins and unknowingly uses that character?

Eventually, we decided that a very special character needed to be used here: something that is still a single character, but wouldn’t accidentally be used by curriculum developers. That character was an emoji. 🁡, to be exact. There wasn’t especially detailed reasoning behind why that particular emoji was chosen, other than that we felt it would be unlikely to be used in curriculum, and that we just liked it.

Side note: it was even trickier picking an emoji to use because Codecademy offers a course in Emojicode (which you should definitely take, by the way!).

With this new delimiter in place, the above broken quiz was now fixed. Unfortunately, though, that was not the end. By using this obscure character, we unknowingly introduced another issue:

That � should be a blank.

Act III: The Replacement

� is called the Unicode Replacement Character, and you can read more about it here. It seems like while 🁡 was a clever trick that could’ve solved our issue, Monaco was most likely running into some sort of encoding issue. One example is this code snippet:

const a = { name: 'Example' };
if (a.🁡 === 'Examples') return true;

In the above example, the quiz should display a blank at the domino. Instead, we were getting an encoding issue, again rendering the blank unusable. After playing around with a few other delimiter options, we ended up using � as our delimiter, due to some of its special properties and its place in the Unicode spec.

The Unicode Replacement Character is not characterized as an ‘emoji’. Instead, it is a special character and was added way back in Unicode 1.1.0 (June 1993). For reference, Unicode is now on version 13.0.0. The Unicode Replacement Character is also included in the Basic Multilingual Plane, while emojis and many newer special symbols are included in what are known as the Supplementary Planes.
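The plane distinction has a concrete consequence in JavaScript, whose strings are sequences of UTF-16 code units. The sketch below shows only standard string and TextEncoder behavior. Whether this is what actually tripped up Monaco is an assumption on our part, since all we know is that an encoding issue was likely. A Basic Multilingual Plane character occupies one code unit, while a Supplementary Plane character occupies two, so anything that cuts a string by code unit can split it in half.

```javascript
const domino = '🁡'; // the first single-character delimiter we tried
const replacement = '\uFFFD'; // the Unicode Replacement Character

console.log(domino.length); // 2 (a surrogate pair of UTF-16 code units)
console.log(replacement.length); // 1

console.log([...domino].length); // 1 (still a single code point)
console.log(domino.codePointAt(0) > 0xffff); // true: outside the BMP
console.log(replacement.codePointAt(0) > 0xffff); // false: inside the BMP

// Cutting by code unit leaves a lone surrogate, which is not a valid
// character on its own. Encoding it to UTF-8 and decoding it again
// yields the replacement character.
const loneSurrogate = domino.slice(0, 1);
const roundTripped = new TextDecoder().decode(new TextEncoder().encode(loneSurrogate));
console.log(roundTripped === replacement); // true
```

This would be consistent with a � showing up where the domino should have been, and it also suggests why � itself is a safer delimiter: it is a single code unit, so there is nothing to split.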

With the delimiter sorted out, there’s one final piece to touch on. There were many mistakes and known pitfalls discovered during this process, but only a very small group of engineers worked on this piece of code. Even with sharing this through knowledge-sharing channels, it’s very likely that engineers not actively involved in this part of the codebase (or even engineers who are!) will not remember the details in 6 months.

Lesson Learned: Leave clear and helpful documentation behind you for the next person!

As an engineer, learning these pitfalls and engineering a solution to them is only half of the problem. The other half is to pass on that knowledge and ensure that the next person does not inadvertently make the same mistakes that you did. To that end, we left a detailed comment above the relevant code explaining why we use this delimiter, and linked out to more detailed documentation that explains all the known problems and the history behind why this piece of code looks the way it does.

The comment reads as follows:

/**
 * This used to be 'fitbblank'. Monaco colorization was occasionally
 * breaking up the string if the line was too long,
 * causing the <span class="fitb-locator"> replacement to fail.
 * by using a single character, monaco will not break up this string.
 * This is the replacement character. (https://en.wikipedia.org/wiki/Specials_(Unicode_block)
 * This means that curriculum can not use this character.
 *
 * If you would like to change this, please look here first: <internal notion link to further documentation>
 */

Wrap-up & TLDR;

Even seemingly small changes can snowball into bugs and headaches, especially when they involve code that you don’t own. Software is constant iteration, learning, and knowledge-sharing. Mistakes are unavoidable, but after the slog through them, it’s important to reflect on and understand what you have learned.

These are three lessons we learned while transitioning our syntax highlighting from CodeMirror to Monaco.

  1. With a package that you don’t own or fully understand, unexpected behavior can (and a lot of the time, will!) happen.
  2. Technical decisions aren’t always purely technical. Oftentimes, they involve balancing other teams’ needs, future projects, deadlines, and a number of other factors.
  3. Leave clear and helpful documentation behind you for the next person!

Shirley Lin
Codecademy Engineering

software engineer @codecademy, prev @datacamp, @grailed