XBlock Lessons: Costly Conveniences

This is the second in a series of articles on lessons learned from the development of XBlocks. This article will give an example of how complexity crept into the system because we were trying to make the XBlock interface easier to use. Please note that this article reflects my personal takeaways and opinions, and others at edX may have different views.

For a high-level overview of what XBlock is, please see the first article in this series. The TLDR version: XBlock is a plugin mechanism and base class for creating a new type of content for Open edX courseware, and almost all instructional content in a course (e.g. videos, problems) is made up of XBlocks.

Our Example: Auto-saving Field Data

The XBlock API allows you to define Fields of various types (e.g. boolean, integer, string, dict) to persist data. Every Field has a Scope which determines where and how it is persisted. For instance, you could make a simple essay XBlock that defines a content-scoped Field to store the question to prompt the user with, and a user_state-scoped Field to store each student’s answer.

When this XBlock is instantiated in courseware for the student, the question field will get its value from the problem definition that was authored in Studio and persisted in MongoDB. The submission field will be populated from a table in MySQL.

Say you create a save_submission handler method that updates the user-scoped field holding the user’s essay response. A simplified version of such a method, modeled on one from the Open Response Assessment Block, might look like:
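A minimal sketch of such a handler follows. It is a plain-Python stand-in, not the actual Open Response Assessment code: the class name EssayBlock and the shape of the data payload are assumptions made for illustration. In a real XBlock, question and submission would be declared as Field objects with the content and user_state scopes rather than set in __init__.

```python
class EssayBlock:
    """Plain-Python stand-in for an essay XBlock (illustrative only)."""

    def __init__(self, question):
        # In a real XBlock these would be Field declarations, roughly:
        #   question   = String(scope=Scope.content)
        #   submission = String(scope=Scope.user_state)
        self.question = question
        self.submission = ""

    def save_submission(self, data, suffix=""):
        """Handler: store the student's essay response."""
        # Only an attribute assignment; there is no explicit save call.
        self.submission = data["submission"]
        return {"success": True}


block = EssayBlock(question="What did you learn this week?")
result = block.save_submission({"submission": "I learned about descriptors."})
```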

Notice that we didn’t have to make any explicit call to persist the new value of submission to the database. XBlock automatically detects which fields have been modified and persists them after the handler method has finished executing. This was done to make the XBlock development process a little easier. After all, if you’ve made a change to a field, you almost certainly want to persist it to the database. The entire Field mechanism shields you from having to understand where the data lives and gives the illusion that each field is just a simple attribute on your XBlock object.

Implementing Auto-Save Behavior

So how do we implement this? How do we detect when a Field has changed?

The first step we can take is to override the __set__ method on the Field class. This is the method that will be invoked every time we assign a value to that field. So we implement this method to include some bookkeeping to track whether changes have been made. Great!
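As a rough sketch of that bookkeeping, here is a simplified Field descriptor. It is not the real XBlock implementation, and the _values and _dirty attribute names are hypothetical.

```python
class Field:
    """Simplified descriptor that records which fields were assigned."""

    def __init__(self, default=None):
        self.default = default
        self.name = None

    def __set_name__(self, owner, name):
        # Called by Python when the class body is created.
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return instance._values.get(self.name, self.default)

    def __set__(self, instance, value):
        instance._values[self.name] = value
        instance._dirty.add(self.name)  # bookkeeping: remember the change


class Block:
    submission = Field(default="")

    def __init__(self):
        self._values = {}
        self._dirty = set()


block = Block()
dirty_before = set(block._dirty)   # nothing assigned yet
block.submission = "my essay"      # goes through Field.__set__
dirty_after = set(block._dirty)    # now contains "submission"
```

A runtime built on this idea would only need to look at _dirty after the handler returns to know which fields to write back.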

Unfortunately, that doesn’t get us all the way there. Mutable types can be altered without ever invoking __set__. Suppose we have a dict Field object named submitted_answers:
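The following sketch illustrates the problem with a stripped-down descriptor that only notices assignment. TrackedField, QuizBlock, and the _values and _dirty names are hypothetical stand-ins, not XBlock classes.

```python
class TrackedField:
    """Minimal descriptor that only notices assignment."""

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        # Hand back the stored dict itself (created on first access).
        return instance._values.setdefault(self.name, {})

    def __set__(self, instance, value):
        instance._values[self.name] = value
        instance._dirty.add(self.name)


class QuizBlock:
    submitted_answers = TrackedField()

    def __init__(self):
        self._values = {}
        self._dirty = set()


block = QuizBlock()

# In-place mutation: goes through __get__, never through __set__.
block.submitted_answers["q1"] = "B"
dirty_after_mutation = set(block._dirty)      # still empty

# Re-assignment: this one does call __set__.
block.submitted_answers = {"q1": "C"}
dirty_after_assignment = set(block._dirty)    # {"submitted_answers"}
```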

Changing the contents of a dict does not call __set__ since you’re just mutating the dictionary in place instead of replacing it with another instance. To address this issue, we have a boolean MUTABLE attribute on all Field types. We then override the __get__ method to mark mutable fields as dirty if we so much as return a reference to them because they might later get mutated in a way that we can’t easily detect. As we’re marking a field dirty, we also make a copy of it so that we can do a comparison later, to see if it actually changed. This mostly works, with one caveat:
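A small illustration of the caveat, using copy.copy from the standard library. The keys 'a' and 'attempts' are placeholder data chosen for this sketch.

```python
import copy

original = {"a": {"attempts": 1}}
copied = copy.copy(original)      # shallow: only the outer dict is new

copied["a"]["attempts"] = 2       # mutate the nested dict through the copy

still_equal = copied == original                 # True
shares_nested = copied["a"] is original["a"]     # True
original_attempts = original["a"]["attempts"]    # 2: the original changed too
```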

In the example above, we made a copy of the dictionary and altered the number of attempts on that copy. Yet it still compared as equal to the original. The reason is that the copy function is shallow, so copied['a'] and original['a'] are two separate references to the same nested dictionary. To safely copy an arbitrarily deep nesting of dictionaries, we need to use the more expensive deepcopy function.
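A sketch of how snapshot-on-read change detection could look once deepcopy is used. This is an illustration under simplifying assumptions, not the XBlock source: MutableField, changed_fields, and the underscore-prefixed attributes are hypothetical names.

```python
import copy


class MutableField:
    """Snapshot-on-read sketch for mutable values."""

    MUTABLE = True

    def __init__(self, default):
        self.default = default
        self.name = None

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        if self.name not in instance._values:
            # Never hand out the shared class-level default.
            instance._values[self.name] = copy.deepcopy(self.default)
        value = instance._values[self.name]
        if self.MUTABLE and self.name not in instance._snapshots:
            # First access: keep a deep copy to compare against later.
            instance._snapshots[self.name] = copy.deepcopy(value)
        return value

    def __set__(self, instance, value):
        instance._values[self.name] = value
        instance._assigned.add(self.name)


class Block:
    answers = MutableField(default={"a": {"attempts": 1}})

    def __init__(self):
        self._values = {}
        self._snapshots = {}
        self._assigned = set()

    def changed_fields(self):
        """Fields that were assigned, or whose value differs from its snapshot."""
        changed = set(self._assigned)
        for name, snapshot in self._snapshots.items():
            if self._values[name] != snapshot:
                changed.add(name)
        return changed


block = Block()
_ = block.answers                                 # read only
changes_after_read = block.changed_fields()       # empty set
block.answers["a"]["attempts"] = 2                # nested in-place mutation
changes_after_mutation = block.changed_fields()   # {"answers"}
```

Because the snapshot is a deep copy, the nested mutation is detected. The price is one deepcopy per mutable field on first access, whether or not anything is ever changed.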

Assessing the Cost

All that defensive deepcopy work has to happen the first time any mutable Field type is accessed by any XBlock. On operations that involve sequences or entire courses, that can be many thousands of times. Depending on the courseware operation, we’ve profiled deepcopy time alone to be between 5% and 15% of execution time.
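To get a feel for the relative cost, here is a small timing sketch using the standard library timeit module. The field_value data is invented for illustration, and the numbers it prints will vary by machine; it is not the profiling referred to above.

```python
import copy
import timeit

# Hypothetical field value: 200 entries, each a small nested structure.
field_value = {
    f"problem_{i}": {"attempts": i, "answers": list(range(10))}
    for i in range(200)
}

runs = 100
shallow_seconds = timeit.timeit(lambda: copy.copy(field_value), number=runs)
deep_seconds = timeit.timeit(lambda: copy.deepcopy(field_value), number=runs)

print(f"shallow copy: {shallow_seconds / runs * 1e6:.1f} us per call")
print(f"deep copy:    {deep_seconds / runs * 1e6:.1f} us per call")
```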

The really aggravating part is that the vast majority of that copying is being done on content-scoped fields for XBlocks rendering in the LMS, and changes to those fields cannot be persisted. The LMS is not allowed to write changes to course content, and the LMS XBlock runtime will throw an error on any attempt to do so. So we are wasting all these cycles making defensive copies of Fields that the LMS never alters and wouldn’t be allowed to write even if it did alter them. But we’re doing it because Fields are implemented in the core XBlock library, and that layer of the code doesn’t understand that some runtimes will treat the content scope as read-only.

Before we contemplate adding an interface to make the Fields more runtime aware, or find another way to add hints about when field changes can or can’t take place during a handler invocation, let’s take a step back and sum up the complexity we’ve already added up to this point:

  • We have bookkeeping code and state to mark when fields are accessed.
  • We’ve put this logic in the somewhat obscure __set__ and __get__ methods.
  • Fields now have to understand a new MUTABLE attribute and potentially confusing reference-copying behavior.

Even if it’s not a lot of code, that’s a surprising amount of conceptual complexity for anyone who has to look through it for the first time. This feature has in fact caused a few bugs around default values, imports, and caching.

In exchange for this complexity, XBlock authors can make field state changes without having to call an explicit save method in their handlers. At a higher level, those developers were shielded from having to understand that their XBlock fields are backed by a database somewhere; they could treat the block like a simple object. It’s important to note that the target audience of XBlock developers included course staff who did not program professionally, and so our goal was to make things as easy as possible to get running.

Lessons Learned

Auto-saving fields is complicated because it tries to be too smart at too low a level, and in a way that isn’t well supported by the language. If persistence were somehow handled at the handler level (say the student_view handler was passed a separate parameter that you could call to get and set student state), then there would be no need to build that intelligence at the individual field level. Even the low-level approach might have been straightforward if Python had better support for immutable data structures (along the lines of certain functional programming languages), since immutable dictionaries would have removed the need for defensive copying and simplified the change detection code.
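To make the handler-level alternative concrete, here is one possible shape for it. Everything in this sketch is hypothetical: UserState, run_handler, and the handler signature are invented for illustration and are not part of the XBlock API.

```python
class UserState:
    """Hypothetical explicit state store handed to a handler."""

    def __init__(self, backing):
        self._backing = backing   # stands in for a database row
        self.writes = {}          # explicit writes, flushed by the runtime

    def get(self, key, default=None):
        if key in self.writes:
            return self.writes[key]
        return self._backing.get(key, default)

    def set(self, key, value):
        self.writes[key] = value


def save_submission(data, state):
    """Handler that reads and writes state through explicit calls."""
    attempts = state.get("attempts", 0)
    state.set("submission", data["submission"])
    state.set("attempts", attempts + 1)
    return {"success": True}


def run_handler(handler, data, backing):
    """Toy runtime: run the handler, then persist only explicit writes."""
    state = UserState(backing)
    result = handler(data, state)
    backing.update(state.writes)
    return result


database_row = {"attempts": 1}
result = run_handler(save_submission, {"submission": "Final draft"}, database_row)
```

With this shape there is nothing to detect: the runtime persists exactly what was passed to set. The tradeoff is that a handler which mutates a value returned by get in place, and never calls set, would silently lose that change.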

Yet it’s too easy to just look in hindsight and declare that this particular bit of API convenience wasn’t worth it. I think the more insidious aspect is how a seemingly small convenience grows in complexity over time. At each step, it looks like we’re plugging just one last hole in our leaky abstraction, but there’s always something else on the horizon. The convenience that presents XBlock fields as simple attributes on a regular Python object hides a basic truth, and we have to keep telling more elaborate lies to hold the abstraction together.

So that leads us to where we are today. Auto-save relies on a couple of magic methods and an understanding of mutability and copy semantics in Python. Every XBlock ever written expects auto-save behavior. How do we address its known performance issues? Do we…

  1. Add more complexity to enable hinting or some other API interaction between the XBlock core library and the runtime so that we don’t make unnecessary copies?
  2. Add the ability for authors to explicitly flag a method as doing a manual save() so we can make some optimizations in that case, but still default to the existing behavior to maintain backwards compatibility?
  3. Hope that we haven’t done a sufficiently good job of profiling and see if there are significant gains that can be made without altering behavior in any way?
  4. Relax some of the API guarantees that prevent edge cases that may not be important in real-world use (e.g. deepcopy and nested dicts) and hope that it doesn’t break too many things?
  5. Make a new version of the API that explicitly breaks backwards compatibility?

None of those is particularly appealing, especially since experience has taught us that there’s likely to be another hole to plug down the line. Yet that’s where this seemingly small feature has led us over the years.
