Module organization in functional languages

art yerkes
Jun 2, 2018

Having spent the past 2 1/2 years writing almost exclusively typed functional code in a variety of languages (F# and Elm dominate, followed by PureScript), I’ve seen a few patterns emerge in the way I divide code into modules, and I think these bear talking about. Moving code isn’t risky in these languages, but it can be a bit of a slog to move some stuff only to find that the dependencies don’t quite all point the right way. Of course, type variables and composition can be a partial solution, but quoting a partial type downstream can also mean duplicating some type information, depending on the type system the code is in. Having a clean dependency tree where each type or function is in a natural place is really joyful when it’s achieved and, possibly unlike a few others in FP, I like working in concrete types and objects.

Some Elm code I’ve deployed recently is a good example of what I’ll call “mature organization”, and I found myself unconsciously mimicking it in another project, which led me to think about sharing this. This kind of organization is probably more important in Elm than in any other language, since the “single source of truth” of The Elm Architecture (TEA) requires the app state to live in precisely one data structure, barring effect managers and other exotics. Because of this, you can’t hide state items from each other behind signals or with isolated STRefs, and you have to confront this dependency graph directly.

Here’s my view of the “mature organization” strategy, in dependency order (later modules depend on more of the earlier ones):

*/*           -- Business logic objects and algorithms go in their
                 own module categories. These are self-contained
                 and don't access State, because they're more
                 general than a single app.
PortTypes     -- Types used by ports, including an "interface"
                 record type containing all the app's port
                 functions as values. In F# or PureScript, I
                 should substitute object or module style
                 interfaces to foreign code, but I've been in
                 practice lazier than that, instead injecting
                 dependencies on foreign code farther down the
                 stack with higher level semantics.
Types         -- Most frequent "agnostic" types used, such as
                 currency, HTTP message types, domain objects
Flags         -- Data consumed at app startup (Elm flags, parsed
                 command line arguments).
Ports         -- A module that contains the ports in Elm.
                 Because port values are functions, they can
                 be passed to views and such in a record that
                 allows testing code to consume surrogates.
State/Types   -- Types used by and to update global state
Bus           -- Messages consumed by State.update to change
                 the application's global state. They can be
                 sent by ports and view code, and run down the
                 application's OutMessage bus. Can use
Codecs        -- A central place for JSON, XML, parsing code
State         -- Application state available to all code
View/         -- Active UI code (update and init fns)
View/Html/    -- HTML and SVG composition, to separate it from
                 message handling.
View          -- Main app container, CSS generation if needed
Main          -- App main and initialization, combines ports,
                 flags, state and View.* functions to make an app.
TestFeature*  -- A main module that tests a specific feature. It
                 doesn't use real ports unless they're being
                 tested in their own right, but instead has
                 features to simulate port and end user activity.
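
The PortTypes and Ports entries are the least conventional part of this layout, so here is a minimal Elm sketch of what a port interface record might look like. Every name in it (PortInterface, saveDocument, documentLoaded, realPorts, fakePorts) is a hypothetical stand-in rather than code from the app described here, and the record type and the ports are shown in one module only to keep the example self-contained; in the layout above the alias would sit in PortTypes.

```elm
port module Ports exposing (PortInterface, fakePorts, realPorts)

-- Hypothetical names throughout; this only illustrates the shape.


-- In the layout above, this alias would live in PortTypes so that
-- views and State can mention it without importing any real port.
type alias PortInterface msg =
    { saveDocument : String -> Cmd msg
    , documentLoaded : (String -> msg) -> Sub msg
    }


-- Real ports, wired to JavaScript.
port saveDocument : String -> Cmd msg


port documentLoaded : (String -> msg) -> Sub msg


-- The record Main would pass down to the rest of the app.
realPorts : PortInterface msg
realPorts =
    { saveDocument = saveDocument
    , documentLoaded = documentLoaded
    }


-- A surrogate for a TestFeature* main: nothing leaves the Elm runtime.
fakePorts : PortInterface msg
fakePorts =
    { saveDocument = \_ -> Cmd.none
    , documentLoaded = \_ -> Sub.none
    }
```

Because downstream code only ever sees a value of type PortInterface msg, swapping realPorts for fakePorts is a one-line change in whichever main module builds the app.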

That’s a lot of modules, but there are great benefits to this kind of organization:

  • You always know where bus messages terminate when they originate from the extremities of the app. The single source of truth becomes the .state member of the app data structure, which can be freely changed to suit specific situations.
  • The business logic behavior of the app becomes relatively easy to test. State is just messages and data and isn’t tangled up in view specific stuff.
  • Bus messages allow decoupling of app behavior and views. The app State is totally unaware of how views use its data. Any mismatch between the message traffic of the UI and the ideal transformations on the app State is handled in View.*.update functions, which take local messages and translate that activity into bus messages.
  • View local state (such as names edited for renaming) is strictly separated from single source of truth state.
  • By keeping Types early in the tree and separate from code or values, we can have nice concrete type annotations in all values and functions without needing to move any code. We can state the types of injected functions clearly whenever a function needs to be injected to counter a potentially wrong dependency arrow.
  • By keeping Flags separate from Types, we can have different main functions use different flags, or use them in a different way, without it being a pain. The downstream code becomes more general if it uses a type called out in Types for initialization that main can pull or synthesize from Flags.
  • Having a separate ports module is a good idea, but it doesn’t gain power unless ports are used through a port interface record that can be simulated in other ways. When your app uses this kind of organization, all but the main module becomes port agnostic. Conceptually, this could be a thing in F# too, where I’ve written a lot of code, but since the composition of the app depends on an fsproj file, I haven’t bothered making all foreign code injectable to the same degree, instead opting to inject surrogate code at the project level. I haven’t written enough PureScript or Haskell to need this kind of hot swappability yet, but it’s only a matter of time before I do.
  • It’s not necessary, but I find it useful at larger scales to have subviews separate out actual HTML or SVG composition so the view’s primary module can focus on the data it contains and how messages transform it. I define the message type in the View module next to where I’m sending the messages but the Model in the primary module next to where the data is transformed. I don’t think there’s any better or worse way of arranging these two really, but that’s what I do.
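
The split between bus messages and view-local messages described in the list above can be sketched in a few lines of Elm. This is an illustration under assumptions, not the deployed code: BusMsg, State, RenameModel, RenameMsg, updateState and updateRename are all hypothetical names, and returning a List BusMsg from the view's update is just one simple way to model an out-message bus.

```elm
module BusSketch exposing (..)

import Dict exposing (Dict)


-- Bus: the only messages the global state update understands.
type BusMsg
    = RenameItem Int String
    | RemoveItem Int


-- State: the single source of truth, unaware of any view.
type alias State =
    { items : Dict Int String }


updateState : BusMsg -> State -> State
updateState msg state =
    case msg of
        RenameItem id name ->
            -- Only renames an item that already exists.
            { state | items = Dict.update id (Maybe.map (always name)) state.items }

        RemoveItem id ->
            { state | items = Dict.remove id state.items }


-- View-local state: the half-typed name is not part of State.
type alias RenameModel =
    { itemId : Int
    , draft : String
    }


type RenameMsg
    = DraftChanged String
    | Confirmed


-- Local messages change local state; only Confirmed emits bus traffic.
updateRename : RenameMsg -> RenameModel -> ( RenameModel, List BusMsg )
updateRename msg model =
    case msg of
        DraftChanged newDraft ->
            ( { model | draft = newDraft }, [] )

        Confirmed ->
            ( model, [ RenameItem model.itemId model.draft ] )
```

In this sketch every keystroke stays inside RenameModel, and State only ever sees a single RenameItem once the edit is confirmed, which is what makes the state logic testable without any view code.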

I find that a similar organization works well regardless of whether I’m using Elm, F#, or PureScript.

One of the reasons I started down the road of functional programming was how unsatisfied I was with the world of mutability when it came to the ability to test and verify code, and how much of a chore maintaining testability was. I find that setting a single dominant organization in functional code early on allows me to build in the ability to test the code without needing to exert a lot of ongoing conscious effort, and I feel that’s a very big win. An IO or Effect monad helps, but isn’t required to ensure that pure code remains pure… You can test it easily without these by just making a project without the native interface modules and seeing what happens, and the habit of treating effects as explicit data is valuable for testability regardless of what language you’re in.
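
As a rough illustration of "effects as explicit data", here is a small Elm sketch in which update returns a plain list describing what should happen instead of performing it. The Effect type and every name in it are hypothetical; a main module would be the place that turns each Effect into a real Cmd, for example through a port interface record.

```elm
module EffectSketch exposing (Effect(..), Model, Msg(..), update)

-- Hypothetical names; the point is that effects are ordinary values.


type Effect
    = SaveToDisk String
    | Log String


type Msg
    = Typed String
    | SaveClicked


type alias Model =
    { text : String }


-- Pure: describes the effects to run rather than running them.
update : Msg -> Model -> ( Model, List Effect )
update msg model =
    case msg of
        Typed newText ->
            ( { model | text = newText }, [] )

        SaveClicked ->
            ( model, [ SaveToDisk model.text, Log "saved" ] )
```

Since Effect is plain data, a test can compare the result of update against an expected tuple with ordinary equality, with no ports or native modules in the project at all.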