Making the World Right with Types

This article demonstrates modeling complex business requirements using type-driven development with the functional-first language F#. A github gist of the final code can be found here.


Clients hire development consultants to help them add value in a very specific problem domain. Before we can even attempt to add value, however, we must collaboratively build a representative model of the problem domain in question. Rather than building up such a universe from ether, modeling is more a process of whittling reality down so that it can be expressed computationally.

This process is referred to as domain modeling. The models we create will never exactly represent the domain they are attempting to mimic. As the saying goes, all models are wrong, but some are useful.

The question becomes, how do we build useful models? The answer is to create an algebra of the problem domain. It sounds really complicated, but algebras (there are a lot of them) are very simple at their core.

An algebra is a combination of a set of types, a set of functions defined with them, and a set of laws that interrelate the functions.¹

A basic example of an algebra is a Semigroup, which is a set of things and a single associative (order of operation doesn’t matter) binary (two things) operation on those things. Integers form a Semigroup. If you take addition as the associative binary operation you can see that (a + b) + c = a + (b + c).

Integers are the the only type in the algebra. There is only one function in the set of functions —the operation of addition. And the associative law is the only law that interrelates the function of addition. Thus, an algebra.

The idea behind creating an algebra of a business domain is that once we have defined all of the things in our domain (types), the relations between them (functions), and overall laws of the system (invariants), we can say with deterministic certainty how we expect such a system to behave.

Unexpected behavior cannot occur. If the output data is wrong it is either because your definition is wrong or the input data is wrong. Period. This is how algebras work; it is how we know that, given two integers a and b, the expression a + b will represent an integer forever until the end of time.

Let’s take a more complicated example of averaging a list of numbers. We know definitionally that the average of a list is the sum of the list divided by the length of the list. We can represent this concept in our algebra.

let sum ns = List.reduce (+) ns
let avg ns = sum ns / List.length ns

Here, we haven’t described how to average a list, but have rather described what an average is in our algebra. No unexpected outcomes can occur.

So, we’ve decided to use an algebraic approach to model our domain. How do we do that? The answer here is type-driven development.

Type-Driven Development

Type-driven development is a style of programming in which we write types first and use those types to guide the definition of functions.²

A type is a classification of a value. As we saw above, an integer is a type. So is a string. Values are constructed and those that are constructed in the same way have the same attributes. This allows you to define what can or cannot be done with an entire class of values. For example, natural numbers (1, 2, 3…) are constructed with a successor function. One is a special number, two is the successor of one, three is the successor of two, ad infinitum. Ergo, we know that every natural number except one is a successor and has a successor.

You can even group types into classes called…type classes. F# doesn’t implement type classes, but we can still leverage the concept. We saw earlier that integers form a Semigroup with addition. If + is overloaded to mean concatenation then strings also form a Semigroup.

(1 + 2) + 3 = 1 + (2 + 3)
val it : bool = true
("foo" + "bar") + "baz" = "foo" + ("bar" + "baz")
val it : bool = true

Type classes provide another layer of abstraction over values. Once we know that a type fits into a class we know that all of the laws of that class will hold.

There is a type class called a Monoid which is a Semigroup with an identity element — a value a for which a + b = b. Integers are a Monoid with 0 as the identity element for addition.

0 + 1 = 1
val it : bool true

If I were to tell you that strings are also a Monoid you know that they must also have an identity element. So, what value exists where concatenating two strings together doesn’t change the value of the non-identity value?

"" + "foo" = "foo"
val it : bool true

Businesses never rely solely on such simple types as integers and strings. And they are certainly never this abstract. But, we can build up a model using these simple types and abstractions that allows us to use the power of algebra even as the complexity increases.

For this article, we are going to create a very simple model of electricity production and distribution. First, we will use algebraic data types to construct composite types from primitive types.

Product Types

Algebraic data types are so named because they can be either product types or sum types. Again, this sounds complicated, but it’s not.

Product types are formed by ANDing other types together. Let’s say that a PowerStation is a name represented by a string AND a supply of electricity represented with an integer. In F# we can create such a PowerStation type in two ways.

// tuple
type PowerStation = string * float
// record
type PowerStation = {
Name: string
Supply: float
}

The first way to create a product type is with a tuple. Notice how we actually use the * symbol. This should hint as to why we use the term product type. If you want to know how many possible different PowerStations could exist you would multiply the possible number of strings (16 bits factorial) with the possible number of floats (64 bits.)

Tuples — ordered lists of varying types — are great for very simple types. However, they use indexing rather than labeling to access values. This forces us to remember that the string is the name and the float is the supply. This is obviously more difficult when you have types made up of two or more of the same simple types like firstName * lastName.

The second way to create a product type is with a record. The added benefit here is that we can now name values. This makes values easier to reference.

Let’s go ahead and create another type. PowerStations generate a supply of electricity and can transmit it to a TransmissionSubstation. A TransmissionSubstation receives electricity from a PowerStation and holds it until it moves to another TransmissionSubstation or is moved to a distribution substation for more general consumption. For our purposes, a TransmissionSubstation is just a name AND a supply of electricity. So, a record is the best way to model it.

type PowerStation = {
Name: string
Supply: float
}
type TransmissionSubstation = {
Name: string
Supply: float
}

All values in a record are required. So it’s now impossible to construct a PowerStation or TransmissionSubstation without a name or supply of electricity. That means never having to check for a null value.

Sum Types

Sum types represent the idea that something could be one thing OR another thing. But, both of those things have commonalities. Think of them like subsets. A real number could be an integer OR a rational OR an irrational, etc.

In our model, both PowerStations and TransmissionSubstations are physical things that don’t move. Let’s give them coordinates and then define a Station type in order to model that fact.

type PowerStation = {
Name: string
Supply: int
Coordinates: float * float
}
type TransmissionSubstation = {
Name: string
Supply: int
Coordinates: float * float
}
type Station =
| PowerStation of PowerStation
| TransmissionSubstation of TransmissionSubstation

This allows us to write functions that handle both PowerStations and TransmissionSubstations by way of treating them as Stations.

Functions

Following type-driven development, we created our types first. Now we are going to use those types to guide the creation of functions in our algebra. Our functions must take as input, and return as output, types from our algebra.

We said earlier that electricity moves from a PowerStation to a TransmissionSubstation. Let’s model that behavior with a function. We can still leverage F#’s type system because functions are types too.

A Transmission takes a PowerStation, a TransmissionSubstation, and the amount of electricity we’re moving from one to the other. It will return a tuple with a new PowerStation and new TransmissionSubstation.

type Transmit = PowerStation -> TransmissionSubstation -> float -> PowerStation * TransmissionSubstation

Of course, this isn’t an implementation of a transmission. It’s just the type signature — a mapping of function inputs and outputs. The easiest way to read it is that the value after the last arrow is the output. All other values are inputs.

But wait, we also said that electricity can be transmitted from one TransmissionSubstation to another. Maybe we should generalize our function to work with all stations.

type Transmit = Station -> Station -> float -> Station * Station

We also said that PowerStations generate electricity and that electricity can be distributed from a TransmissionSubstation. Those sound like functions.

type Generate = PowerStation -> float -> PowerStation
type Distribute = TransmissionSubstation -> float -> TransmissionSubstation

Because a Transmission is more general, it deals with Stations. Because only PowerStations can generate electricity and only TransmissionSubstations distribute electricity we use more specific inputs.

We have successfully defined the types of things we have in our domain and and the relationships between them. Most importantly, we’ve done so with code that is so readable that non-programmers can follow along. This is invaluable as we can sit with domain experts and collaboratively create our algebra.

We are now ready to add some business requirements so that we can add some real value.

Measurements

In the electricity industry amounts are measured in watts. We want to make sure that when we are generating or transmitting electricity we are always using the same units of measurement. Conveniently, F# has a measurement attribute that allows us to create measurement labels.

[<Measure>] type W

But, in this case, we have even more help from a library with standard unit names. Let’s make sure that the supply of electricity in PowerStations and TransmissionSubstations is always given in watts.

open Microsoft.FSharp.Data.UnitSystems.SI.UnitNames
type PowerStation = {
Name: string
Supply: float<watt>
Coordinates: float * float
}
type TransmissionSubstation = {
Name: string
Supply: float<watt>
Coordinates: float * float
}

Let’s now implement our generate function.

type Generate = PowerStation -> float<watt> -> PowerStation
let generate (powerStation: PowerStation) (supply: float<watt>) =
{ PowerStation with Supply = powerStation.Supply + supply }

Now, if we try to call generate without specifying the measurement of the supply we get an error at compile time. I’ve added ellipses in order to avoid continually typing out the Name and Coordinate values. They aren’t valid F# syntax. You can see the full code here.

\\ wrong
generate { ... Supply: 1000.0<watt> ... } 500.0
\\ right
generate { ... Supply: 1000.0<watt> ... } 500.0<watt>

Fulfilling Requirements with Types

Let’s say we are given a business requirement that says that PowerStations cannot have negative supply — supply has to either be a positive number of watts or nothing.

This is what we call an invariant of the algebra. And, we want to enforce that invariant through types so that the compiler will do the work of checking it at compile time.

The first thing we want to do is give Supply an explicit type.

type Supply = Supply of float<watt>

Because a float can be negative, there is nothing currently preventing us from creating a negative supply. So, we’re going to hijack the Supply constructor and enforce our invariant there.

let Supply (electricity: float<watt>) =
if electricity > 0.0<watt> then
Some (Supply electricity)
else
None

We use the Option type to return Some value or None. This will later allow us to easily pattern match against a supply and take different actions depending on whether or not supply is present.

Let’s change our PowerStation type to use Supply and create two power stations: East and West.

type PowerStation {
Name: string
Supply: Supply option
Coordinates : float * float
}
// val east : PowerStation = {Supply = Some (Supply 1000.0);}
let east = {
...
Supply = Supply 1000.0<watt>
...
}
// val west : PowerStation = {Supply = None;}
let west = {
...
Supply = Supply -1000.0<watt>
...
}

When we attempt to create a PowerStation with negative supply we get one with None supply instead. Just like we know that natural numbers must be and have a successor, we know that PowerStations cannot have negative supply because that is how they are constructed.

We have used the type system to enforce the constraint that supply cannot be negative. In dynamically typed languages we would have to write run time checks in every function that manipulates a power station’s supply. This would probably involve throwing and catching errors. Furthermore, we would have to write unit tests for every single one of those functions in order to show that the correct errors are thrown.

Here the type system and compiler do the work for us; not only that, our code is much more explicit.

We can now create a function to generate electricity that uses pattern matching.

type Generate = PowerStation -> Supply option -> PowerStation
let generate (powerStation: PowerStation) (supply: Supply option) =
match powerStation.Supply, supply with
| Some (Supply powerStationSupply), Some (Supply supply) ->
{ powerStation with Supply = (Supply (powerStationSupply + supply)) }
| None, Some (Supply supply) ->
{ powerStation with Supply = (Supply supply) }
| _, _ ->
powerStation
// val it : PowerStation = { Supply = Some (Supply 1500.0); }
generate east (Supply 500.0<watt>)

We are matching the existing powerStation.Supply and the generated supply. If the PowerStation has Some supply and the produced supply is also Some then we combine them and return a new PowerStation. If the powerStation.Supply is None and the produced supply is Some then we return a new PowerStation with the produced supply. For all other cases we just return the power station unchanged.

Just like PowerStations, the supply of electricity in a TransmissionSubstation can’t be negative. So, let’s change our TransmissionSubstation type to use our Supply constructor and implement our function to distribute electricity to both use Supply.

type TransmissionSubstation = {
Name: string
Supply: Supply option
Coordinates: float * float
}
type Distribute = TransmissionSubstation -> Supply option -> TransmissionSubstation
let distribute (substation: TransmissionSubstation) (supply: Supply option) =
match substation.Supply, supply with
| Some (Supply substationSupply), Some (Supply supply) ->
{ substation with Supply = (Supply (substationSupply - supply)) }
| None, _ ->
{ substation with Supply = None }
| _, _ ->
substation

With distribution we aren’t concerned about where the electricity goes once it leaves our domain. So, we can ignore the ramifications for the receiver of electricity if we try to distribute more electricity than a TransmissionSubstation has in its Supply. If that happens we just set substation.Supply to None with our Supply constructor.

The last function that we need to implement is transmission from one station to another. Let’s look at the type signature.

type Transmit =
Station -> Station -> Supply option -> Station * Station

The reason we created a Station type was so that we could write functions that received as inputs or returned as output a PowerStation OR TransmissionSubstation. That means that within our function we have to pattern match to unpack our types. The alternative would be to write two different functions: one to handle transmissions from power stations to transmission substations and one to handle transmissions from one transmission substation to another.

It seems like added work, but the idea is to combine like things in a way that matches the domain. PowerStations and TransmissionSubstations have many things in common — they both have a supply of electricity and the both allow transmissions.

The first layer of pattern matching destructures our stations. We are either dealing with a power station and transmission substation or two transmission substations. If we’re not we just return the original stations. Once we have unpacked our station types we want to check their supply. If the from station has enough supply then we move that amount. If it doesn’t then we move whatever supply it does have.

Let’s create some power stations and transmission substations to test our domain.

let east: PowerStation = {
Name = "East"
Supply = Supply 1000.0<watt>;
Coordinates = 0.0, 0.0
}
let west: PowerStation = {
Name = "West"
Supply = Supply 500.0<watt>;
Coordinates = 0.0, 0.0
}
let north: TransmissionSubstation = {
Name = "North";
Supply = Supply 100.0<watt>;
Coordinates = 0.0, 0.0
}
let south: TransmissionSubstation = {
Name = "South";
Supply = Supply 200.0<watt>;
Coordinates = 0.0, 0.0
}
> generate west (Supply 100.0<watt>)
> val it : PowerStation = {Name = "West";
Supply = Some (Supply 600.0);
Coordinates = (0.0, 0.0);}
> distribute north (Supply 100.0<watt>)
> val it : TransmissionSubstation = {Name = "North";
Supply = None;
Coordinates = (0.0, 0.0);}
> trasmit (PowerStation east) (TransmissionSubstation south) (Supply 100.0<watt>)
> val it : Station * Station =
(PowerStation {Name = "East";
Supply = Some (Supply 900.0);
Coordinates = (0.0, 0.0);},
TransmissionSubstation {Name = "Paola";
Supply = Some (Supply 300.0);
Coordinates = (0.0, 0.0);})
> transmit (TransmissionSubstation north) (TransmissionSubstation south) (Supply 200.0<watt>)
> val it : Station * Station =
(TransmissionSubstation {Name = "North";
Supply = None;
Coordinates = (0.0, 0.0);},
TransmissionSubstation {Name = "South";
Supply = Some (Supply 300.0);
Coordinates = (0.0, 0.0);})

In the last transmission, we requested to move more supply out of North than it currently had in supply. The result was to transfer whatever was in store and set its supply to None.

There are plenty of things we could do to continue to build out our model. Currently, PowerStations and TransmissionSubstations are exactly the same. And our transmit function practically treats them as such. Maybe TransmissionSubstations have a storage capacity and we need to make sure they have enough storage to receive a transmission.

In order to enforce this business requirement we might want to create a TransmissionSubstationWithCapacity type constructed from a TransmissionSubstation. We would then alter our transmit function to work with the new type. The compiler would stop us from ever calling transmit on a TransmissionSubstation that wasn’t guaranteed to have storage capacity. This is referred to as making illegal states unrepresentable.

Likewise, our coordinates are unbounded floats. We know that Latitude is a float between 0 and 90 plus a direction (North or South.) Longitude is a float between -180 and 180. So, we would probably want to make Latitude and Longitude types and a record type to AND them into a record.

We should let the problem domain guide us, though. Maybe we’re not using coordinates for anything other than displaying them in a summary of a Location. In that case, it isn’t worth the extra work to make them anything more than a tuple of floats.

Regardless, when we use type-driven development to model domains we follow a simple process: develop types, define functions on those types, and capture domain rules in the created types. Iterate through the process adding only the requisite complexity.

The goal is to build an algebra of the domain that gives us all of the power and confidence of a deterministic system. The result is a good night’s sleep assured that the software you’ve built works exactly as specified.



Find us on Facebook and Twitter or get in touch at thirteen23.com.