Towards Maintainable Elixir: Boundaries
The previous article discussed the high-level design of Very Big Things’ projects. Today we’ll dive a bit deeper and take a look at the namespace structure. The word namespace here refers to dot-separated module names. For example, the MySystem namespace includes the MySystem module, as well as its “sub-modules”, i.e. modules whose names start with the MySystem. prefix. MySystemWeb is another namespace, containing its own set of modules.
In this article I’ll use the term boundary to refer to a module namespace. This terminology comes from the boundary library, which we use to control cross-module dependencies in our projects. In a nutshell, boundary allows us to create groups of modules, called boundaries, and manage dependencies between them.
For example, we can define one boundary for the core layer, and another for the interface layer, permitting only dependencies from the interface to the core. Furthermore, we can make modules private, so they can only be invoked from within their boundary. For example, making the repo module private will prevent direct repo usage in the interface. These rules are checked by the boundary compiler, which will emit warnings if any rule is violated. Since we check for compilation warnings on the CI, it follows that code can only be merged if it fully complies with the configured rules.
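As a minimal sketch, rules like these might be declared as follows. The names Xyz and XyzWeb anticipate the project discussed below, while the exported Accounts module is purely hypothetical. The sketch assumes the boundary dependency is installed and its compiler is enabled.

```elixir
# Assumes the :boundary dependency is installed and its compiler is enabled in
# mix.exs via `compilers: [:boundary] ++ Mix.compilers()`.

defmodule Xyz do
  # Core: depends on no other boundary. Only the listed modules are exported
  # (`Accounts` is a hypothetical name); everything else, such as a repo
  # module, stays private to the core.
  use Boundary, deps: [], exports: [Accounts]
end

defmodule XyzWeb do
  # Interface: may call into the core, but only through its exported modules.
  use Boundary, deps: [Xyz], exports: []
end
```

With a setup along these lines, a call from XyzWeb to a non-exported core module, or any call from Xyz to XyzWeb, should be reported as a compiler warning.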
Let’s see the boundary structure in a real project that we will call XYZ in this use case. The business domain of this project is irrelevant, as we’ll focus only on the generic aspects of it. The following graph depicts our top-level boundaries and their dependencies:
The main boundaries are Xyz (core) and XyzWeb (interface). Together, these two boundaries contain most of the project’s code. The remaining boundaries emerged over time, and were introduced to address some practical issues. The naming scheme for top-level boundaries follows the convention proposed by Phoenix, where the suffix is appended directly to the context name, instead of using the . separator (e.g. XyzConfig instead of Xyz.Config). Let’s take a quick tour through these boundaries.
The XyzConfig boundary is a single-module boundary that consolidates what we call the operator config. These are the system parameters that have to be provided on the target machine (e.g. staging or production), such as the site’s public URL, the database connection string, credentials for 3rd party services, etc. The config module wraps the access to those parameters. The client code invokes something like XyzConfig.database_url() and the value is fetched from some source, such as the OS env.
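A minimal sketch of such a config module, assuming the values come from OS environment variables. The variable names and the site_url function are assumptions made for this example; only database_url is mentioned above.

```elixir
defmodule XyzConfig do
  @moduledoc "Single access point for operator-provided parameters (sketch)."

  # The env var names below are assumptions made for this example.

  @spec database_url() :: String.t()
  def database_url, do: System.fetch_env!("DATABASE_URL")

  # Hypothetical accessor for the site's public URL.
  @spec site_url() :: String.t()
  def site_url, do: System.fetch_env!("SITE_URL")
end
```

Using System.fetch_env!/1 makes a missing parameter fail loudly with an exception instead of silently returning nil. A real project may well read from other sources or validate the values up front.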
By wrapping both core and interface parameters (e.g. database connection string and site URL), this module breaks away from our standard practice of separating the core from the interface. However, there’s a practical reason that justifies this decision. We want to have a single place where the system configuration is defined, so we can easily see all the parameters that must be provided on the target machine. Since the module contains both core- and interface-specific concerns, it shouldn’t be a part of either layer, so it’s promoted to a top-level boundary.
Somewhat controversially, we decided to group our Ecto schema modules under a single top-level boundary. We mostly avoid bundling things based on their technical properties, but we made an exception here because the number of Ecto schemas tends to explode pretty quickly. For example, in this particular project, 50 out of 100 modules are Ecto schemas. The prevalence of schema modules becomes even more striking if we focus on the context boundary (Xyz), which consists of only 23 modules. Having schema modules scattered around the context folders would make code navigation significantly harder.
Moreover, Ecto schemas in the context boundary lead to some strange module names, such as Xyz.Accounts (context) and Xyz.Account (schema), or alternatively Xyz.Accounts.Account (schema inside the context namespace). To prevent these issues and improve the code navigation experience, we decided to consolidate schemas under a single boundary.
Ecto schemas logically belong to the core, and in fact we initially did keep them in the Xyz.Schemas namespace. However, to ensure that no complex logic, such as changeset building or repo operations, creeps into these modules, we’ve placed schemas into a separate top-level boundary which is not allowed to depend on any other boundary.
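The constraint could be sketched like this. The boundary name XyzSchemas is an assumption that follows the naming convention described earlier, and the Account schema with its fields is hypothetical. The sketch assumes both the boundary and ecto dependencies.

```elixir
defmodule XyzSchemas do
  # No deps: schema modules cannot reach the repo, the contexts, or the web
  # layer. `Account` is exported so other boundaries can reference the struct.
  use Boundary, deps: [], exports: [Account]
end

defmodule XyzSchemas.Account do
  use Ecto.Schema

  # Data definition only; changesets and queries live in the core.
  # Table and field names are hypothetical.
  schema "accounts" do
    field :email, :string
    timestamps()
  end
end
```

Boundaries that work with the schemas would then list XyzSchemas in their own deps.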
The XyzApp boundary contains the logic required for the OTP application and release. This boundary was introduced to break the dependency cycle between the interface and the core. In our high-level design, we don’t want the core to depend on the interface. However, when you create a new Phoenix project, this rule is immediately violated. Namely, the Phoenix project generator places the OTP application module inside the context boundary. But the app module depends on the endpoint module from the web boundary, and so we end up with a dependency cycle. To break this cycle, we moved the app module to its own top-level boundary. Our custom project generator does this automatically for all new projects.
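A rough sketch of what such an application module might look like. It assumes that XyzWeb exports its Endpoint module; the supervisor name is hypothetical, and the core’s own processes are omitted because how the core exposes them is not covered here.

```elixir
defmodule XyzApp do
  # The app boundary sits above both layers, so neither the core nor the
  # interface has to depend on it.
  use Boundary, deps: [Xyz, XyzWeb]
  use Application

  @impl Application
  def start(_type, _args) do
    children = [
      # Core processes (e.g. repos) would also be started here; they are left
      # out of this sketch.
      XyzWeb.Endpoint
    ]

    # The supervisor name is a hypothetical choice.
    Supervisor.start_link(children, strategy: :one_for_one, name: XyzApp.Supervisor)
  end
end
```

The application callback in mix.exs would then point at this module, e.g. `mod: {XyzApp, []}`.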
The XyzMix boundary contains all the code that is specific to our custom mix tasks, which mostly revolve around setting up the database on local dev and CI. This is the only part of the code where runtime invocation of mix functions, such as Mix.env(), is permitted. Compile-time mix invocations are allowed everywhere.
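The difference between the two kinds of invocation can be illustrated with a small sketch. The module name Xyz.BuildInfo is hypothetical, and the code is meant to be compiled as part of a mix project.

```elixir
defmodule Xyz.BuildInfo do
  # Compile-time invocation: Mix.env() is evaluated while the module is being
  # compiled, and the result is baked into the module attribute.
  @env Mix.env()

  def env, do: @env

  # Runtime invocation, shown only for contrast: Mix.env() is evaluated on
  # every call. Mix is typically not part of a release, so this kind of call
  # is the one restricted to the mix boundary.
  def env_at_runtime, do: Mix.env()
end
```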
The web boundary contains Phoenix and Absinthe-specific logic. The internal structure of this boundary follows the conventions proposed by the Phoenix project generator and the official docs. In recent projects, we’ve started experimenting with an alternative organization in the web layer, moving away from the controllers/views/templates consolidation in favor of the approach where each logical scope (e.g. account, admin, …) gets its own folder that contains the controller, view, templates, and other scope-specific files.
The core is the only boundary that is further divided into sub-boundaries, a couple of which are presented in the following diagram:
Xyz.Infra is a “sink boundary”, which means that everything else in the core depends on it. This boundary contains modules that support access to infrastructural services, such as AWS or Active Directory. The infra boundary also contains two Ecto repos; the extra repo is needed to support multitenancy via dynamic databases. The infra boundary exports these modules so they can be used from other boundaries. However, since infra is a sub-boundary, access can only be granted to its sibling boundaries. In other words, this design prevents anyone outside of the core from directly using the repos, the AWS client, and other infra modules.
The rest of the core’s sub-boundaries each handle some functional aspect of the system behaviour. For example, Xyz.Account deals with account operations (e.g. registration, login, password reset, notifications), while Xyz.Tenant handles tenant management (e.g. create and drop tenant). There are a few more such boundaries which are not included here for the sake of brevity.
For details on how sub-boundaries work, you can check the Nested boundary section in the boundary docs.
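The core’s sub-boundaries described above might be declared roughly as follows. This is a sketch under assumptions: the exported module names Repo and Aws are hypothetical, the dependency lists are simplified, and the top-level Xyz boundary is assumed to be declared elsewhere.

```elixir
defmodule Xyz.Infra do
  # Sink boundary: depends on no sibling and exports the modules that the
  # siblings are allowed to use (names are hypothetical).
  use Boundary, deps: [], exports: [Repo, Aws]
end

defmodule Xyz.Account do
  # Functional sub-boundary that uses infra services.
  use Boundary, deps: [Xyz.Infra], exports: []
end

defmodule Xyz.Tenant do
  use Boundary, deps: [Xyz.Infra], exports: []
end
```

As long as the top-level Xyz boundary does not export Xyz.Infra, code outside of the core has no way to depend on it.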
The presented design essentially follows the strategy outlined in the previous article, which is the separation of the core from the interface. To further distill the core and reduce the dependencies, we needed to introduce a couple of additional top-level boundaries.
Going further, the primary focus of our design process is the core. This is the code that implements the expected behavior of the system, so it’s the part that deserves the most attention. By splitting the core boundary into sub-boundaries we were able to further split the code into smaller, self-contained, mostly independent parts.
It should be noted that this structure emerged over time. We started much more modestly, stashing all core operations in the top-level Xyz module. Once this module grew in size, we took some time to identify potential split strategies based on the existing functionality and the corresponding code, instead of vague guesses. Essentially, this is a lightweight agile approach to code design.
The boundary tool plays an important role in this process. It can automatically enforce our design constraints (e.g. calls from the core to the interface are not allowed). In addition, it assists us with further design refinements by allowing us to see the bigger picture. Reasoning about the dependencies between groups of modules is much easier than making sense of a large number of individual cross-module dependencies.