Getting a grip on module loading order beyond trial and error
(this blog has a Korean translation here)
In the many projects I have maintained so far, sooner or later I always run into the same issue: circular module dependencies. Although there are many strategies and best practices on how to avoid circular dependencies, there is very little on how to fix them in a consistent and predictable way. Usually, people move import statements or blocks of code around at random until “it suddenly works”. As it turns out, I am not the only one running into this problem, given the responses to this tweet:
Luckily, as I will demonstrate below, there is a consistent way in which these dependency issues can be fixed.
In this blog post we will work with an artificial application that pretty-prints object trees into a YAML-like format:
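As a rough sketch (not the exact code from the sandbox), a single-file version of such an app could look something like this. Only AbstractNode, Node, Leaf, getDepth and the static from method are named in this post; the print method, the constructor signatures and the sample input are assumptions made for illustration.

```javascript
// Sketch only: `print` and the constructor signatures are assumed for illustration.
class AbstractNode {
  constructor(parent) {
    this.parent = parent;
  }

  // Root nodes have depth 0; every level below adds 1.
  getDepth() {
    return this.parent ? this.parent.getDepth() + 1 : 0;
  }

  print() {
    throw new Error("abstract; not implemented");
  }

  // Factory: objects become Node instances, everything else becomes a Leaf.
  // Node and Leaf are declared further down, which is fine because this
  // body only runs after the whole file has been evaluated.
  static from(thing, parent) {
    return thing && typeof thing === "object"
      ? new Node(parent, thing)
      : new Leaf(parent, thing);
  }
}

class Node extends AbstractNode {
  constructor(parent, thing) {
    super(parent);
    this.children = {};
    for (const key of Object.keys(thing)) {
      this.children[key] = AbstractNode.from(thing[key], this);
    }
  }

  // Nested nodes start on a new line; leaves are printed inline.
  print() {
    const indent = " ".repeat(this.getDepth() * 2);
    return Object.keys(this.children)
      .map((key) => {
        const child = this.children[key];
        const separator = child instanceof Node ? ":\n" : ": ";
        return `${indent}${key}${separator}${child.print()}`;
      })
      .join("\n");
  }
}

class Leaf extends AbstractNode {
  constructor(parent, value) {
    super(parent);
    this.value = value;
  }

  print() {
    return String(this.value);
  }
}

const tree = AbstractNode.from({ name: "demo", tags: { a: 1, b: 2 } });
console.log(tree.print());
// name: demo
// tags:
//   a: 1
//   b: 2
```

Because everything lives in one file, the base class is always defined before the subclasses that extend it, and from is only called once all three classes exist.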
You can try it yourself in this codesandbox. The implementation of this app is pretty straightforward. There is a base class, AbstractNode, that defines the contract and offers some common functionality, like getDepth(). Next, there are two specializations, Node and Leaf. This works fine and dandy, but maintaining three classes in a single file is not ideal. So let’s refactor and see what happens…
Once we move each class to its own file, it turns out that the very same application suddenly, utterly dies, seemingly beyond repair and with a pretty vague exception: TypeError: Super expression must either be null or a function, not undefined. ¯\_(ツ)_/¯!
Yet the changes were pretty minimal, as shown below (click here to see the sandbox in its broken state):
The above changes are enough to break the application. Note that Node and Leaf are imported in the AbstractNode.js module, as those classes are used by the static from method.
The reason that the application breaks is that AbstractNode is not yet defined when it tries to load the Leaf class. This might be surprising, because after all, there is a proper import statement above the class definition of Leaf. But here is what happens when the modules are being loaded:
- The module loader starts loading AbstractNode.js and running the module code. The thing it first encounters is a require (import) statement to Leaf.
- So the module loader starts to load the Leaf.js file. Which, in turn, starts by requiring AbstractNode.js.
- AbstractNode.js is already being loaded, and is immediately returned from the module cache. However, since that module did not run beyond the first line yet (the require of Leaf), the statements introducing the AbstractNode class have not yet been executed!
- So, the Leaf class tries to extend from the undefined value, rather than a valid class. Which throws the runtime exception shown above. BOOM!
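The steps above can be replayed in a single file with a toy loader. This is a deliberately simplified model of a CommonJS-style module cache, not the real implementation of Node.js or of any bundler; createLoader and the way module bodies are written as functions are made up for this sketch.

```javascript
// Toy model of a CommonJS-style loader; real loaders and bundlers are more involved.
function createLoader(files) {
  const cache = {};
  return function load(name) {
    if (cache[name]) return cache[name].exports; // possibly still incomplete!
    const module = { exports: {} };
    cache[name] = module; // registered BEFORE the module body runs
    files[name](load, module.exports);
    return module.exports;
  };
}

const files = {
  "AbstractNode.js"(require, exports) {
    // Step 1: the first statement is a require of Leaf.js.
    const { Leaf } = require("Leaf.js");
    class AbstractNode {
      static from(value) {
        return new Leaf(value);
      }
    }
    exports.AbstractNode = AbstractNode;
  },
  "Leaf.js"(require, exports) {
    // Steps 2 and 3: the cache answers with the still-empty exports object.
    const { AbstractNode } = require("AbstractNode.js");
    // Step 4: AbstractNode is undefined here, so the class definition throws.
    class Leaf extends AbstractNode {}
    exports.Leaf = Leaf;
  },
};

let error = null;
try {
  createLoader(files)("AbstractNode.js");
} catch (e) {
  error = e;
}
console.log(error instanceof TypeError); // true
```

The exact error message differs per tool chain (the one quoted earlier comes from transpiled code), but the cause is the same: the base class is read from the cache before the statement that defines it has run.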
Fix attempt 1
So, it turns out that our circular dependency causes a nasty problem. However, if we look closely it is pretty easy to determine what the loading order should be:
- Load the AbstractNode class first.
- Load the Node and Leaf classes after that.
In other words, let’s define the AbstractNode class first, and then have it require Node and Leaf. That should work, because Node and Leaf don’t have to be known yet when defining the AbstractNode class. As long as they are defined before AbstractNode.from is called for the first time we should be fine. So let’s try the following change:
Turns out, there are a few problems with this solution:
First, this is ugly and doesn’t scale. In a large code base, this will result in moving imports randomly around until stuff just happens to work. Which is often only temporary, as a small refactoring or change in import statements in the future can subtly adjust the module loading order, reintroducing the problem.
Secondly, whether this works is highly dependent on the module bundler. For example, in codesandbox, when bundling our app with Parcel (or Webpack or Rollup), this solution doesn’t work. However, when running this locally with Node.js and CommonJS modules this workaround might work just fine.
Avoiding the problem
So, apparently, this problem cannot be fixed easily. Could it have been avoided? The answer is yes, there are several ways to avoid the problem. First of all, we could have kept the code in a single file. As shown in our initial example, that way we can solve the problem as it gives full control over the order in which module initialization code runs.
Secondly, some people will use the above problem as an argument to make statements like “One should not use classes”, or “Don’t use inheritance”. But that is an over-simplification of the problem. Although I agree that programmers often resort to inheritance too quickly, for some problems it is just perfect and might yield great benefits in terms of code structure, reuse or performance. But most importantly, this problem is not limited to class inheritance. Exactly the same problem can be introduced when having circular dependencies between module variables and functions that run during module initialization!
Thirdly, we could re-organize our code in such a way that we break up the AbstractNode class into smaller pieces, so that AbstractNode has no dependencies on Node or Leaf. In this sandbox the from method has been pulled out of the AbstractNode class and put into a separate file. This does solve the problem, but now our project and API are structured differently. In large projects it might be very hard to determine how to pull this trick off, or even impossible! Imagine for example what would happen if another part of AbstractNode came to depend on Node or Leaf in the next iteration of our app…
Bonus: an additional ugly trick I used before: return base classes from functions and leverage function hoisting to get things loaded in the right order. I’m not even sure how to explain it properly.
The internal module pattern to the rescue!
I have fought with this problem on multiple occasions across many projects. A few examples include my work at Mendix, MobX, MobX-state-tree and several personal projects. At some point, a few years ago, I even wrote a script to concatenate all source files and erase all import statements. A poor man’s module bundler, just to get a grip on the module loading order.
However, after solving this problem a few times, a pattern appeared. One which gives full control over the module loading order, without needing to restructure the project or pull weird hacks! This pattern works perfectly with all the tool-chains I’ve tried it on (Rollup, Webpack, Parcel, Node).
The crux of this pattern is to introduce an index.js and an internal.js file. The rules of the game are as follows:
- The internal.js module both imports and exports everything from every local module in the project.
- Every other module in the project only imports from the internal.js file, and never directly from other files in the project.
- The index.js file is the main entry point and imports and exports everything from internal.js that you want to expose to the outside world. Note that this step is only relevant if you are publishing a library that is consumed by others. So we skipped this step in our example.
Note that the above rules only apply to our local dependencies. External module imports are left as is. They are not involved in our circular dependency problems after all. If we apply this strategy to our demo application, our code will look like this:
When you apply this pattern for the first time, it might feel very contrived. But it has a few very important benefits!
- First of all, we solved our problem! As demonstrated here our app is happily running again.
- The reason that this solves our problem is: we now have full control over the module loading order. Whatever the import order in internal.js is, will be our module loading order. (You might want to check the picture below, or re-read the module order explanation above to see why this is the case)
- We don’t need to apply refactorings we don’t want. Nor are we forced to use ugly tricks, like moving require statements to the bottom of the file. We don’t have to compromise the architecture, API or semantic structure of our code base.
- Bonus: import statements will become much smaller, as we will be importing stuff from fewer files. For example, AbstractNode.js has only one import statement now, where it had two before.
- Bonus: with index.js, we have a single source of truth, giving fine-grained control over what we expose to the outside world.
This is how I solve circular dependency issues nowadays. It takes some initial refactoring work on import statements if you apply this to an existing project. But the process itself is dumb and straight-forward. And after that, you have full control over the module loading order, making it possible to immediately address any circular dependency issues that arise in the future.
Here are a few real-life commits of refactorings that make use of this solution:
- MobX (big change, but not impactful as it is straightforward)
- MobX-state-tree (notice how end-of-file imports were eliminated)
- Smaller personal project
So far, I have never applied this pattern to big projects, only to libraries. But for big projects it should work to apply this technique just to the sub-folders in your project where this problem occurs, as if they were standalone libraries.
Let me know if this pattern works for you as well! Also, let me know when there is a code-mod that applies this process (hint) :-)