Algebraic Data Types in Scala

Shannon Barnes
5 min readMay 1, 2018

--

Often Scala developers originate from either two camps. The first camp comes from the traditional computer science background where imperative and object-oriented coding was the main emphasis. The main data type concept I was taught in my CS courses was the abstract data type, often abbreviated as ADT. Code a well-known interface that hides the internals of the implementation was heavily drilled in the curriculum.

The second camp seems to be prior Haskell programers that speak their own exotic language derived from category and type theory. Coming from the functional programming paradigm, they have their own ADT, the algebraic data type.

Having an algebraic data type in Scala simply means two things are involved:

  1. A sum type consisting of various subtypes of other sum and product types.
  2. The ability to analyze the values of the algebraic data with pattern matching.

This does not appear to be related to anything I learned in high school algebra. It turns out that any concept that can be thought of as (a) a set of objects, and (b) operators on those objects is a form of algebra.

To help to understand the meaning of sum and product, first it would be helpful to think about how many distinct values can exist in the following non-composite types.

Nothing // 0 distinct values
Unit // 1 distinct value
Boolean // 2 [true,false]
Byte // 256
Int // ~4 billion
String // A lot

Product Type

The product type in Scala are typically represented in Scala as a case class or case objector a TupleN instance. They simply contain a fixed order of fields that are associated with logical conjunctions (ANDs). You can think of them as having this field AND that field ….

The set of all possible values of a product type is its Cartesian product.

case class Student(enrolled: Boolean, age: Byte)
// Students consists of a Boolean AND Byte
// 2 * 256 = 258
type aTuple2 = (Boolean, Unit)
// aTuple2 consists of a Boolean AND Unit
// 2 * 1 = 2
type aTuple3 = (String, Boolean, Nothing)
// aTuple3 consists of a String AND Boolean and Nothing
// A lot * 2 * 0 = 0

Sum Type

The sum type in Scala is implemented by way of inheritance. The values of the sum type can are subtypes associated with logical disjunctions (ORs), where the possible values are this subtype OR that subtype …

Let’s take the following example:

sealed trait Color 
case object Red extends Color
case object Blue extends Color
case object Green extends Color
// Color can be Red OR Blue OR Green
// 1 + 1 + 1 = 3 distinct possible values

Color is a sum type with three possible values each represented by a subtype (Red, Blue, Green)

Now lets bring it all together and create a hybrid type (both sum and product) using a common ADT in Scala, the Option:

sealed trait Option[+A]  
case object None extends Option[Nothing]
case class Some[A](a: A) extends Option[A]
type TwoBooleansOrNone = Option[(Boolean, Boolean)]
// BooleanOrNone can be None OR Some(Boolean AND Boolean)
// 1 + (2 * 2) = 5 distinct possible values

Here we create a sum type from Option that can either be None, or a Some containing a tuple. The tuple is a product that has two booleans. This gives us a total of five distinct possible values.

Pattern Matching

Pattern matching allows you to decompose a given ADT, binding the values it was constructed from to variables. Scala does by using an extractor method under the hood. The extractor extracts the fields from our product type. Pattern matching can also assert the subtype from our sum type without the need of an isInstanceOf operator.

Case classes are special because Scala automatically creates a companion object for them: a singleton object that contains not only an apply method for creating new instances of the case class, but also an unapply method – the method that needs to be implemented by an object in order for it to be an extractor.

Below is an example of pattern matching with a function that returns true if our TwoBooleansOrNone contains two true values

def mustBeBothTrue(in: TwoBooleansOrNone): Boolean = in match {
case None => false
case Some((true, true)) => true
case Some((false, true)) => false
case Some((true, false)) => false
// We are missing Some((false, false)) in our match
}

When we define an algebraic data type using sealed traits we allow the compiler to perform exhaustiveness checking. This means the compiler will shout at us if we miss a case condition in our pattern check.

When I try to compile the mustBeBothTrue function the Scala compiler knows that there are five distinct possible values that could be assigned to a TwoBooleansOrNone instance and it will give me a warning.

warning: match may not be exhaustive.
It would fail on the following input: Some(_)

I was missing the following case statement in order to catch the fifth and missing possible value:

case Some((false, false)) => false

We can also get an exhaustive search with the following shorthand:

def mustBeBothTrue(in: TwoBooleansOrNone): Boolean = in match {
case Some((true, true)) => true
case _ => false //all other possible cases are false
}

Sample ADT that models JSON

Just for illusion we can make a sample ADT that represents JSON

sealed trait Json
case object JsonNull extends Json
case class JsonString(s: String) extends Json
case class JsonInt(num: Int) extends Json
case class JsonBool(value: Boolean) extends Json
case class JsonField(name: String, value: JValue) extends Json
case class JsonObject(obj: List[JField]) extends Json
case class JsonArray(arr: List[JValue]) extends Json
object Json {def parse(rawJson: String): Either[String, Json] = {...}

def children(json: Json): List[Json] = json match {
case JObject(l) => l
case JArray(l) => l
case JField(n, v) => List(v)
case _ => Nil
}
def firstChild(json: Json): Option[Json] = json match {
case JsonObject(h::t) => Some(head)
case JsonArray(h::t) => Some(h)
case JsonField(n, v) => Some(v)
case _ => None //This will catch empty JsonObject(Nil)
}
}

We create a Json ADT with a companion object that with a non-implemented function can parse a raw JSON string and return an Either, another ADT that can be either a Json result or an error String. Also in our companion object, we provide some useful functions to operate over our Json ADT using pattern matching, a function to return a list of child elements (if it has any) and a function to return the first child (again, if it has one).

What is interesting about the firstChild function, is that the first two case statements are able to ignore when a JsonObject or JsonArray has an empty list. That is because List is also an ADT and we match against lists that only have a head and tail. That list example is using infix extractor notation that is common for pattern matching with Lists, it could have also been written as ::(head, tail).

--

--