Scala 3: Intersection and Union Types
Two new features in Scala 3’s type system are intersection and union types. They embrace set theory more explicitly to improve the soundness of the type system, where a type can be considered a set and the instances of it are the members of the set.
December 29, 2020 and January 7, 2021. Fixed mistakes in the discussion of subtyping for intersection types and the distribution laws. Thanks to James Warren , Sean Kim, and Pascal Mengelt for catching my mistakes.
June 29, 2023. Fixed mistakes in the discussion of covariance and contravariance. Thanks to Ivan Zalentsov for catching the errors.
For an even more concise summary of most of the changes in Scala 3, see my new Scala 3 Highlights page. Note in particular the different examples I use there for intersection and union types.
Intersection Types
Let’s start with some definitions we’ll use as examples:
All the types derive from M
and all define or override a method m
, including calls to the parent’s m
. Let’s declare some instances in the Scala 3 REPL. If I used curly braces instead of significant indentation for the previous type definitions, they would work for Scala 2 as well. The following code works for both Scala 2 and 3:
I’ll come back to the assertions and calls to m
below. Note the types for c12
and c21
. Both are intersection types. In Scala 2, c12
would have the type C with T1 with T2
. Note that you can’t write val c12 = new C & T1 & T2
; the extends
and with
keywords are still used.
Going back to set theory, what does C & T1 & T2
actually mean? A valid instance, like c12
, must be a member of the set of all instances for type C
and a member of the set T1
and also of T2
. This works like it always does in object-oriented programs. A variable of type C
can point to c12
. So can a variable of type T1
, etc. In fact, all of the following are valid:
Now here’s why I added the m
methods and invocations. In set theory, the intersection commutes, A & B == B & A
. The same is true for Scala 3 intersection types. c12
is considered to have the same type as c21
. All the following permutations are valid:
However, the assertions above illustrate that the order of invocation of super.m
depends on the order the traits were mixed in. It’s roughly right to left preference. The full algorithm for resolving precedence over a hierarchy is called linearization and Scala 2 used it, as well. I discuss it in the book, but won’t go into more details here. The important point is that the results returned by c12.m
differ from the results of c21.m
. Despite their type equivalence, they don’t necessarily behave the same way.
Intersection types are also associative: (A & B) & C == A & (B & C)
.
What about subtyping? If A <: B
, then A & A2 <: B
. for any A2
. To be clear, this means (A & A2) <: B
. Note that subtyping A <: B
means that all of A
’s instances are part of B
’s set of instances. The intersection with A2
reduces this subset further, but doesn’t remove any of the instances from B
’s set.
Similarly, if A <: B1
and A <: B2
, then A <: B1 & B2
. Think about the examples above to see why this works.
What about covariant subtyping? Suppose that C[A]
is covariant in A
(like Seq[A]
, for example). In this case, C[A & B] <: C[A] & C[B]
. Consider this example:
You can see this by thinking about set membership. Since any T1 & T2
element is also a T1
, then list1
should make sense. Both c12
and c21
are subtypes of T1
. Similarly for list2
. list3
takes a bit more thought, but again this is the union of two sets, so we’re saying that a sequence assigned to list3
must only have elements that are instances of T1
and instances of T2
, which is true for both c12
and c21
.
Intersection types expand on and improve the formalism of the familiar mixin model from Scala 2. You still use extends
and with
to construct types but the actual type is an intersection type.
Union Types
Union types are closer to the Either[A,B]
type.
Consider the following definitions, which use a union type in the idiomatic way that Either[A,B]
is used for returning errors on the left or successful results on the right:
As the comment says, if you declare a variable that’s initialized with a sequence of error
and result
, but you don’t provide a type annotation, Scala infers Seq[Object]
, the normal OO behavior we’re accustomed to. (It’s a quirk of the current Scala 3 implementation that the inferred type uses Object
instead of AnyRef
; they are essentially equivalent.) However, you can now explicitly annotate seq
to say the sequence must contain instances of Good
or Bad
. This is better than Object
or AnyRef
, because it’s more specific.
How do you work with these values?
In Scala 2, work
would return Either[Bad|Good]
and the conditional would return Left(Bad(...))
or Right(Good(i))
.
Note that you’ll typically need to use pattern matching to determine what you actually have. You can do this with Either
, too, but Either
also gives you operations like map
, flatMap
, etc. which union types don’t provide. However, on the plus side, union types aren’t limited to just two types, so they will find broader use than just as an error handling idiom.
Returning to set theory, note that work
says it will return an instance from the set of all Good
instances or from all Bad
instances.
Like intersection types, union types commute, so Good | Bad
is equivalent to Bad | Good
and they are associative, A | ( B | C) == (A | B) | C
.
Subtyping also works similarly; A <: A|B
and if A <: C
and B <: C
, then A|B <: C
.
Distributive Law for Intersection and Union Types
You might recall the distributive law for multiplication and addition:
A * (B + C) == (A * B) + (A * C).
Union and intersection types also obey distributive laws for booleans:
A & (B | C) == (A & B) | (A & C)
and A | (B & C) == (A | B) & (A | C)
Study the following declarations and convince yourself they are true, using concrete examples as needed:
Covariance vs. Contravariance
We saw above how intersection types work with covariant parameterized types, like Seq[A]
. What about union types? First, let’s be clear about whether union types are subtypes or supertypes:
For t123a
, of type T1 | T2 | T3
, if we can assign a T1
instance to it, then T1 | T2 | T3
is a supertype of T1
and similarly for T2
and T3
.
What about sequences?
Union types are covariant. We can assign t1s
of type Seq[T1]
to t123s1
of type Seq[T1 | T2 | T3]
. This works because t123s1
can accept a sequence holding instances of T1
, T2
, and/or T3
, but we only assign to it as sequence of T1
instances. Similarly for t123s2
and t123s3
.
We can’t go the other direction. If you look at the contents of t123s
, it should be clear why t1s1
, t2s1
, and t3s1
can’t work. They declare that they will only ever hold instances of T1
, T2
, and T3
, respectively, but clearly that’s not true of t123s
.
What about contravariant parameterized types?? Functions are our best examples of types that are contravariant in some of the type parameters. In this case, it’s the types for the function arguments, e.g., Function2[-A1, -A2, +R] == (A1, A2) => R
. (I explain in the book why these types must be contravariant. It has to do with the Liskov Substitution Principle.)
So, consider the following… carefully:
The f123a
method does the pattern matching we discussed above, returning a String
.
Here’s the brain bending part; the type of f123a
is identical to f123b
. The rest of the example just proves it works. (Again, if it compiles, it must be true, right??)
Here’s where it helps to think of a set as a constraint on what’s allowed to be a member. The type of f123a
says “I am able to accept an instance of either one of the three types, then I’ll return a string.” The signature of f123b
says “I’m very particular; the only function you can assign to me must be able to accept a T1
and return a String
, and accept a T2
and return a String
, and accept a T3
and return a String
. The only kind of function we can write like that, which is a single function satisfying these constraints, is a function that takes an argument of type T1 | T2 | T3
, meaning it can handle instances of any of these three types.
Final Thoughts
We ended on a rather heavy note, but I hope most of this post made sense. Basing intersection and union types on set theory makes the type system more sound. There will be fewer “divergent expansions” that sometimes happened with Scala 2. I think they also provide more options for typing, yet remain mostly intuitive, when you recall the basics of set theory (or at least work through some concrete examples.)
I’ll continue our tour of new type system features in several more posts.
For an even more concise summary of most of the notable changes in Scala 3, see my new Scala 3 Highlights page.
You can start reading the rough draft of Programming Scala, Third Edition on the O’Reilly Learning Platform. The first half or so of the chapters are available. I am refining them still, but any feedback is welcome!